In the past twenty years a great deal of interest has developed
in organized reading improvement at the college level. In 
1929 Parr sent a letter to a large number of state universities,
asking for information concerning their remedial reading pro- 
grams. Of forty schools replying, only nine reported offering 
remedial work, and seven of these merely included it as a part 
of ‘how to study’ courses. 

In 1940 Witty surveyed one hundred thirty-one universities,
colleges and normal schools, a number of which were known to 
give reading instruction, and found a total of only forty-one with 
active programs. He concluded that “remedial work in colleges
has made very little progress,” but pointed out that instructors
and administrators at that time were becoming increasingly
aware of the importance of the problem and were taking more
interest in it.

* The senior author designed the experiment and collected a part of the
data, the junior author analyzed the data and worked through many of
their implications, and this article was written jointly. Thanks are due
Professor J. B. Stroud of the State University of Iowa for making this study
possible and for many valuable suggestions, and to Miss Helen Price and
Miss Hope Vollink for aid in collecting the data. Acknowledgment is due
the University of Denver for funds for the preparation of this manuscript
granted through the Bureau of Research in Humanities and Social
Development.

The results of a more recent survey of one thousand five hundred
twenty-eight collegiate-level institutions were reported by Triggs
in 1942. Of the three hundred five schools returning
questionnaires, one hundred eighty-five offered reading instruc- 
tion. Twenty-five institutions had started their work in the 
year preceding; seventy-two had been in operation for two or 
three years; and thirty-seven for four or five years; leaving only 
twenty-one institutions reporting which had had programs prior 
to 1937. Seventy-three additional institutions reported that 
they hoped to start a reading program in the fall of 1942. 

In addition to sponsoring reading programs, schools have been 
concerned with their value. Many excellent reports have been 
published, of which the few by the following authors are mentioned
later in this paper: Kilby, Averill and Mueller, Bond, Hultin, Deal,
Dearborn and Wilking, Goldstein and Justman, Lauer, Remmers and
Stalnaker, Smith, Tyler, and Weber.
These reports are almost uniformly optimistic, and many even 
enthusiastic. It would seem that college reading improvement 
programs are ‘successful.’ Unfortunately, most studies suffer 
from somewhat inadequate experimental design, particularly 
with respect to control of practice effects and level of motivation. 


PROBLEM 


It is the purpose of the present paper: (a) to report briefly 
a reading development program* conducted during 1944 and 
1945 at the State University of Iowa, (b) partially to evaluate 
reading gains achieved through this program, and (c) to present 
the results of an attempt to predict magnitude of gains from data 
available before instruction. This information should throw 
light on certain persistent problems in the field, and provide 
useful information for those interested in organizing or evaluating
college reading programs.

* In 1944 the Communication Skills Committee of the University outlined
a program including reading training. Lower division credit courses
in this area were organized by Professor James B. Stroud, and initiated in
the fall of 1944. The present paper is based on the first year and a half of
instructional work.


SUBJECTS


All freshmen entering the University in the fall of 1944 were
given a battery of tests: (a) Iowa Tests of Educational Development,
1944 Edition, consisting of six subtests, including correctness
and effectiveness of expression, interpretation of reading
materials in the social studies, natural sciences, and literature,
general mathematical ability, and vocabulary. Results on all
are combined to make up the freshman composite score. (b)
Blommers’ Rate of Comprehension test, consisting of twenty
paragraphs arranged in order of difficulty, each followed by a
question, the correct answer to which must be selected from
several suggested answers. Rate of reading is computed only
for those paragraphs answered correctly. This seems a priori to
be the most adequate of college level rate tests. (c) Iowa Silent
Reading Tests, New Edition, Forms Am and Bm, consisting of
several short tests, of which in this case the poetry comprehension
test was not used. Analysis in the remainder of the paper is
based on the rate and comprehension subtests, and on the median
of the scores on all six of the subtests used. (d) Michigan Speed
of Reading Test, Forms I and II, consisting of seventy-five sections
of exactly thirty words each. Each section is divided into
two sentences, in the second of which is a word to be crossed out,
because it spoils the meaning of the first sentence. (e) Speed of
Perception Test,* consisting of three parts, each containing
approximately one hundred items, calling for matching an original
word, letter, or number with one of five similar alternative items
given on the same line.

From the three hundred eighty-nine students scoring below 
the fiftieth percentile on the Blommers reading test, approxi- 
mately two hundred were chosen and assigned at random to 
twelve groups. The compulsory classes so formed were dis- 
tributed over the school year, 1944-45, with new groups start- 
ing in September, November, January, and February. Each 
class met four days a week for fifty minutes for group instruc- 
tion. Each group met approximately twenty times, and all 
groups covered essentially the same material. A few volunteers 
were allowed to attend classes, and certain small groups received 
special instruction. Classes as finally constituted ranged in 
size from eight to twenty-five students, almost all falling between
sixteen and twenty-one. 

* The Speed of Perception Test was developed by Professor James B.
Stroud of the State University of Iowa.

Out of the total number of two hundred thirty-nine students
who took part in the program, seventy-two were eliminated from
the analysis of results for various reasons: (1) all volunteers
and other students taking part who were not randomly selected, 
(2) those who did not complete the instruction, (3) those whose 
test results were incomplete, and (4) those who were given special 
instruction. This left a total number of one hundred sixty- 
seven students who represent a fairly random sample from the 
lower half of the freshman population chosen on the basis of the 
Blommers Rate of Comprehension test. The only apparent bias 
arises out of the fact that those who dropped out of school during 
training were of lower academic standing. 

In the early part of 1945, a control group of forty-two students 
was chosen at random from the remainder of the original group 
of students below the fiftieth percentile in reading who had not 
already been assigned for reading instruction. 

A few results have been included for an only roughly compar- 
able group of one hundred fifty-six students trained similarly in 
1945-46. 


PROCEDURE 


Each instructional class was subjected to initial testing, reading
instruction, final testing, and retesting at the end of the school
year. Initial class testing was done with the Iowa Silent Reading
Tests, New Edition, Advanced Test, Forms Am or Bm, and the 
Michigan Speed of Reading Test, Forms I or II. Final testing 
utilized the alternate forms of these two tests, with the addition 
of an alternate form of the Blommers reading test, and the Higher 
Examination, Form B, of the Otis Self-administering Tests of 
Mental Ability, the results from which are not used in this
analysis. The original forms of the Blommers and Michigan 
tests were given to all available experimental students in April, 
1945, in an attempt to evaluate retention. This retest came an 
average of 13.8 weeks after the completion of instruction. 

Initial class test performance was motivated by an explanation 
that instruction was to be based on the test results. Final class 
testing was explained as the basis for course grading. Retesting 
was explained as another check of weak readers with possible 
repeating of reading instruction as the consequence of low scores. 

The control group was tested concomitantly with the experi- 
mental group starting in February, 1945, with the same initial 
and final tests, but no formal reading instruction. Initial testing 
was explained as a check to determine whether the student
should be excused from reading classes, and final testing as a 
check on persons who had done poorly on the initial test. It 
was felt that all tests for both experimental and control groups 
were given under conditions of nearly maximum motivation. 

Instruction was conducted by assistants chosen from among 
advanced graduate students with interests in reading problems. 
The program centered around the fifteen films in the Harvard 
Films for the Improvement of Reading series. Each film was
shown at least once, and could be shown as many additional times 
as the instructor found useful. It should be noted that only very 
seldom was a film shown more than twice to a given group. A 
variable speed 16 mm. projector was used, and the speed of 
showing was increased during the course from approximately 
one hundred eighty to over four hundred words a minute. The 
comprehension check questions accompanying each film were 
used and a record sheet of the results of these tests was kept by 
each student, along with the film speed, allowing him to keep a 
close check on his own progress. Limited additional instruction 
was given. Word derivations were discussed, and ‘new’ words 
brought in for class discussion. Students were held individually 
responsible for approximately ten new words a week, taken by 
them from their reading in other courses. Study methods were 
discussed, as well as techniques of comprehension through organi- 
zation. Certain sections were used from Wilking and Webster’s 
A College Developmental Reading Manual.

To fill in the gaps in the above program and provide for a 
variety of functional reading experiences, a set of supplementary 
exercises was developed. These covered such skills as: speed of 
phrase perception and comparison, analysis of purpose of author, 
and reasoning out the implications of information given. 

Motivation during instruction seemed high, and written atti- 
tude checks indicated an excellent understanding and apprecia- 
tion of the purpose and value of the course by the students. 


RESULTS 


1. Results from Pre-experimental Testing 


Pre-experimental scores were available for the Blommers test, 
the Michigan Speed of Reading Test, and the Iowa Silent Reading 
Tests for the experimental and control groups. These are important
as to general level and interrelations. Study of Table 1
establishes the over-all picture for the experimental and control
groups in comparison with test norms. It can be seen that the
experimental group achieved average scores below the freshman
norms on five of the initial tests. On the other hand, the control
group, selected in the same way from the same population as the
experimental group, is above the freshman average on the speed
of perception, Michigan, and Iowa tests.


TABLE 1.—PRE-INSTRUCTION TEST SCORES

                                      Experimental      Control
                                         Group           Group           Norms
                                       (N = 167)       (N = 42)
Test                                    Mean    S.D.    Mean    S.D.    Mean    S.D.    Ref.

Blommers rate (standard scores)         43.8    7.10    45.4    6.53    54.5    9.91       2
Iowa Tests of Educational
  Development (composite of
  standard scores)                     312.4    31.4   332.6    35.3     336      42      19
    Percentile of mean                  29th            47th
Speed of Perception (standard
  scores)                               46.5    9.94    51.9    9.49      50      10      28
    Percentile of mean                  36th            59th
Michigan (number correct)               47.2   10.38    50.3    9.33      50       9      12
    Percentile of mean                  43rd            55th
Iowa Silent Reading Tests
  Median (standard scores)              84.3   12.21    90.6   14.25      89   14.6*      15
    Percentile of mean                  39th            54th
  Comprehension (standard scores)       86.1   12.93    91.3   14.25
  Rate (standard scores)                87.5   12.75    85.6   10.56
  Total Rate (words/min.)                300   58.01     295   37.26
  Part C (words/min.)                    307   67.95     303   67.86

* Estimated by present authors from data given in the test manual.
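
The percentile ranks printed beside the group means in Table 1 can be checked roughly against the norm means and standard deviations. The following is a minimal sketch, assuming (the source does not say so) that the norm distribution is approximately normal; the function name `percentile_of_mean` is illustrative only. The published percentiles were presumably read from the test norms themselves, so agreement is close for the freshman composite and only approximate for some of the other tests.

```python
from statistics import NormalDist


def percentile_of_mean(group_mean: float, norm_mean: float, norm_sd: float) -> float:
    """Percentile rank of a group mean under an ASSUMED normal norm distribution."""
    return 100.0 * NormalDist(mu=norm_mean, sigma=norm_sd).cdf(group_mean)


# Freshman composite, Table 1: norm mean 336, norm S.D. 42.
experimental = percentile_of_mean(312.4, 336.0, 42.0)
control = percentile_of_mean(332.6, 336.0, 42.0)
print(f"experimental: {experimental:.1f}, control: {control:.1f}")
```

For the composite score the normal approximation falls close to the 29th and 47th percentiles given in the table.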

The situation is further complicated by the fact that the con- 
trol group is statistically significantly superior to the experi- 
mental on freshman composite, speed of perception, Iowa total 
median, and Iowa comprehension subtest. It has not been 
possible to account for this difference in terms of any known vari- 
ables. In any case, it is felt that the differences, though statis- 
tically significant, were of small enough absolute magnitude not 
to complicate the comparison seriously. A check was made on 
this assumption by closely matching forty-two experimental 
subjects with the forty-two control subjects for freshman com- 
posite scores, and initial Blommers scores. This special experi- 
mental group, as will be shown later, gained very nearly as much 
as the whole experimental group on the Iowa and Michigan tests. 

TABLE 2.—PEARSON PRODUCT-MOMENT CORRELATION COEFFICIENTS
BETWEEN INITIAL TEST SCORES
(1944-45 EXPERIMENTAL GROUPS)

                                              1    2    3    4    5    6    7    8    9

 1. Initial Blommers Test
 2. Initial Michigan Test                   .47
 3. Initial Iowa Rate SS*                   .39  .43
 4. Initial Iowa Rate Part C                .35  .40  .89
 5. Interpretation of Reading Materials:
      Social Studies                        .34  .36  .32  .32
 6. Interpretation of Reading Materials:
      Natural Science                       .40  .35  .29  .29  .62
 7. General Mathematical Ability            .32  .21  .26  .25  .40  .44
 8. General Vocabulary                      .20  .20  .20  .25  .34  .34  .11
 9. Freshman Composite                      .44  .49  .39  .41  .81  .80  .56  .49
10. Speed of Perception SS*                 [values illegible in source]

For variables 1 through 9, N = 167.
For speed of perception correlations, N = 133.
All correlations except 7 vs 8 and 9 vs 10 are significant at the 1 per cent
level of confidence.
* SS means standard scores.


Table 2 shows the intercorrelations of the various initial tests
for the experimental group. Most of the high correlations are
between subtests and the whole tests to which they contribute
their scores. The relatively high correlation of Blommers with 
Michigan leads to a suspicion, borne out by the sizeable correla- 
tion between Michigan and Iowa rate subtest, that a fairly pure 
rate factor is being measured by these tests. The relatively high 
correlations of Michigan and Iowa rate with freshman composite 
may indicate either that verbal intellectual ability is important 
in these rate tests, or that rate ability plays an important part 
in this kind of verbal intellectual ability test. The most likely 
hypothesis is that both relationships hold. It is interesting to 
note in this connection that the ‘interpretation of reading’ tests 
make a large contribution to the freshman composite, while 
the speed of perception test shows very low correlations with the 
other tests. 

Certain rather important tentative conclusions seem to be 
warranted by the correlations presented: (a) all the correlations 
are positive, indicating some general underlying factor, perhaps 
‘reading ability’ or level of verbal intelligence; (b) the small size 
of the correlations indicates that for this group the tests are 
measuring rather different aspects of reading behavior; and 
finally, (c) there is a considerable need for college-level reading 
tests, possibly based on factor analysis, to discriminate more
adequately the various functional aspects of reading. 


2. Gains in Reading Performance 


Reading test gains during the program are summarized in 
Table 3. The wide variability of the gains (see sigmas) is prob- 
ably due to individual differences in reaction to the course and 
to a lowered reliability of the tests as applied to such a homo- 
geneous group. In spite of this variability, gains by the experi- 
mental group were consistently greater than those by the controls. 

In many studies similar to this, gains have been due in large
part to regression effects. In the analysis of the present data,
Blommers test results were corrected for regression,* and it was
unnecessary to apply systematic corrections to mean gains on
the Iowa and Michigan. The scores on these tests were not
used as selection criteria, hence regression would only be toward
the means of our own groups, not those of the whole standardization
population.

* Where A was the standard score on the first Blommers test, and B the
predicted standard score on the second, the equation took the form
B = .68A + 17.4.
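
The regression equation given for the Blommers test can be applied as in the sketch below. This is an illustration only: the source gives the equation but does not spell out how the correction entered the reported gains, so the "gain beyond regression" computed here (observed final score minus predicted score) is an assumption, and both function names are hypothetical.

```python
def predicted_final(initial_ss: float, slope: float = 0.68, intercept: float = 17.4) -> float:
    """Predicted second-test standard score B from first-test score A: B = .68A + 17.4."""
    return slope * initial_ss + intercept


def gain_beyond_regression(initial_ss: float, final_ss: float) -> float:
    """ASSUMED form of the correction: observed final score minus predicted score."""
    return final_ss - predicted_final(initial_ss)


# A student starting at the experimental group's initial mean (43.8, Table 1).
initial = 43.8
expected = predicted_final(initial)
print(f"predicted second-test score: {expected:.1f}")
print(f"rise expected from regression alone: {expected - initial:.1f}")
```

Under this equation the predicted score equals the initial score at about 54.4 (17.4 / .32), close to the norm mean of 54.5 in Table 1, so students selected from below that point would be expected to rise somewhat on a second test without any instruction.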

TABLE 3.—SUMMARY OF MEAN GAINS AND SIGNIFICANCE RATIOS
MADE ON THE BLOMMERS, IOWA, AND MICHIGAN READING TESTS
BY THE 1944-1945 EXPERIMENTAL AND CONTROL GROUPS

                                                     Experimental Group
                                             Gain by   Instruction                      Retest
                                             Control      Gain      Differ-              Gain
Test                              Unit        Group    (N = 167)     ence       CR    (N = 109)

Blommers Rate of Comprehension Test
  M                               SS            6.1       11.2        5.1      3.8      14.1
  SD                                            8.1        8.0
Iowa Silent Reading Tests
  Median
    M                             SS            3.6        7.6        4.0      4.2
    SD                                          5.5        5.5
  Rate Subtest
    M                             W/M            -7        107        114     12.1
    SD                                         49.5       55.6
  Rate Subtest (Part C)
    M                             W/M            -6        129        135      8.9
    SD                                         69.9       90.9
  Comprehension
    M                             SS            4.6        6.5        1.9      0.8
    SD                                         12.3       13.6
Michigan Speed of Reading Test
  M                               SS            4.3        8.6        4.3      3.5      13.4
  SD                                            6.3        7.3

M = mean; SD = standard deviation; SS = standard score; W/M = words per
minute; CR = critical (significance) ratio.
All differences except for Iowa Comprehension are significant at the one-
tenth of one per cent level of confidence.


Gains on Blommers Rate of Comprehension Test.—In view of
Price’s finding of a positive relation (r = .35) between Blommers
rate and freshman grade point average, gains on this test
are of practical importance. The total experimental group
reached a mean level of the seventy-first percentile on its final
test.* This represented a gain of 11.2 standard score points as
compared with 6.1 by the control group, a difference statistically
significant beyond the one-tenth of one per cent level (CR of 3.81). A
‘retest’ of one hundred nine students an average of 13.8 weeks
after completion of the course on the ‘initial’ form of the test
(as only two forms were available) showed a high retention,
with eighty-seven per cent of the group scoring as high as or higher
than on the ‘final’ test. Although this result seems to indicate
excellent retention of reading course gains, it should be qualified
before any final conclusion is drawn. Retention results are
probably spuriously high due to practice effect on the same test
form, dropping out of poorer students (retest N = 109), and
general practice effect. The results are probably depressed somewhat
due to ‘ceiling’ effect.

* In this paper, ‘initial’ refers to tests at the start of the year or of special
reading instruction, ‘final’ refers to tests at the end of the special reading
instruction period, and ‘retest’ to the tests given near the end of the school
year to all available students who had completed the reading course at any
time in that year.
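
The significance ratios (CR) in Table 3 can be approximated from the tabled means and standard deviations. The sketch below assumes the familiar form for two independent groups, the difference between mean gains divided by the standard error of that difference; the source does not state the exact formula used, and the rounded table entries reproduce the reported ratios only approximately. The function name `critical_ratio` is illustrative.

```python
from math import sqrt


def critical_ratio(mean_a: float, sd_a: float, n_a: int,
                   mean_b: float, sd_b: float, n_b: int) -> float:
    """Difference between two independent means over its standard error (assumed formula)."""
    se_diff = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se_diff


# Table 3 values: experimental group (N = 167) against control group (N = 42).
blommers_cr = critical_ratio(11.2, 8.0, 167, 6.1, 8.1, 42)
iowa_median_cr = critical_ratio(7.6, 5.5, 167, 3.6, 5.5, 42)
print(f"Blommers CR, recomputed: {blommers_cr:.2f} (reported 3.81)")
print(f"Iowa median CR, recomputed: {iowa_median_cr:.2f} (reported 4.17)")
```

The recomputed ratios fall near, but not exactly on, the published ones, which is what would be expected if the original computation used unrounded values or a slightly different standard error.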

A correlation of only .37 was obtained between ‘initial’ and 
‘final’ tests, and was clearly attenuated by the varying intervals 
between the tests for the various instructional groups. A rank- 
order correlation of .75 was obtained between mean amount of 
gain and lateness in year of instruction for eleven groups. This 
could be due to a number of factors, including improved instruc- 
tional procedures, and general freshman gain similar to that 
reported by Gladfelter. Since the control group was given its
‘final’ test approximately twenty-four weeks after the start of 
the program, while the experimental groups were tested ‘finally’ 
after an average of twenty weeks, the experimental gains prob- 
ably are relatively greater than the comparison used would 
indicate. 

Gains on Iowa Silent Reading Tests.—Iowa test scores are of 
particular interest in view of the positive relationship shown 
between them and college grades by Kilby. The median subtest
score on the Iowa showed an average gain of 7.6 standard
score points for the experimental and 3.6 for the control group, 
with a CR of 4.17. Thus there was a significant general reading 
gain by the reading-training group on the Iowa. This is made 
even more convincing by the comparison of the results for the 
forty-two matched experimental subjects and the control group. 
The control group gained from the 53rd to the 64th percentile, 
while this experimental group increased from the 52nd to the 
70th percentile (gain difference significant beyond the 1 per cent
level).

Comparisons of total experimental and control subjects can
also be made as to gains on Iowa subtests. The experimental
group made larger gains on location of information and rate subtests
significant beyond the one per cent level, as well as sizeable com- 
prehension gains which were not statistically significant due to 
the large intra-group variabilities. Greatest comprehension 
gains were made by those with initially high rate and low com- 
prehension scores. The experimental group gained one hundred 
fourteen more words per minute than the controls, but the ceiling 
reached on this, as well as several other of the subtests, doubtless 
decreased the size of the gain. This gain by the experimental 
group represents a gain of only thirty-five per cent in absolute 
speed whereas a gain of eighty-six per cent was made in ‘free’ 
reading. (Discussed in ‘equated reading’ section.) Analysis of 
initial and final rate scores of the control subjects showed that 
practically no gain was made on this test by them. Thus the 
Iowa rate test would appear to be the least affected by practice 
of the tests here used. 

The Iowa tests are particularly useful in that they allow rela- 
tively independent measurement of rate and comprehension. 

Gains on the Michigan Speed of Reading Test.—On the Michigan 
test, the total experimental group gained 8.4 in number of items 
attempted, while the control group gained only 4.7 (CR of 3.30). 
The experimental subjects made an average gain from the 43rd 
to the 76th percentile. The total experimental group gained an 
average of 8.6 items correct, while the controls gained only 4.3 
(CR of 3.50). Thus the experimentals made a slight relative 
gain in ‘comprehension’ or accuracy. It is interesting to note 
that the special experimental group matched with the controls 
gained from the 57th to the 90th percentile, while the controls 
gained from the 56th to the 72nd (difference in gains significant 
beyond the one per cent level). The retest scores for the experi- 
mental group (N = 109) after an average of 13.8 weeks showed 
almost complete retention of the gains made on the final test. 
Unfortunately, here again the ‘retention’ may be spuriously high 
due to practice effect, repetition of the pre-training form of the 
test, and dropping out of school of the poorer readers. 

Gains on the Harvard ‘equated reading’ series.—The fifteen read- 
ing selections accompanying the fifteen Harvard films are quite 
useful for evaluating reading in a relatively free situation. It is 
possible to keep a record of speed gains with a check on compre- 
hension based on the tests accompanying the selections, since
the tests are uniformly of ten items, and the selections are fairly 
well matched as to difficulty and length. 

Our procedure called for the student to read the selection at 
his own rate, then take the appropriate comprehension test. 
Constant instructor pressure was maintained for increases in rate 
and comprehension, but it was made clear that these exercises 
were only for practice and gains would not directly influence the 
grade in the course. Thus, although students kept their own 
records, there was little or no incentive for exaggeration of gains, 
and instructors rather carefully checked recording. 


TABLE 4.—MEAN SCORES IN RATE AND COMPREHENSION ON
EACH OF FIFTEEN EQUATED TRANSFER READING SELECTIONS
MADE BY 119 OF THE STUDENTS WHO TOOK PART IN THE 1944-45
PROGRAM

                   Mean Rate Score    Mean Comprehension
Selection No.         (Wds/Min)              Score
      1                  274                  6.2
      2                  289                  6.5
      3                  328                  6.1
      4                  354                  6.6
      5                  341                  7.2
      6                  383                  7.8
      7                  374                  7.1
      8                  438                  6.1
      9                  421                  6.1
     10                  453                  6.7
     11                  476                  7.5
     12                  479                  5.9
     13                  508                  6.4
     14                  495                  6.0
     15                  494                  6.3


Analysis was made of the graphs of one hundred nineteen 
students who had kept complete records and whose recordings 
appeared fairly accurate. Mean rate and comprehension scores 
are given in Table 4. It can be seen that the curve for rate 
increased for the first thirteen selections, then flattened out, while 
comprehension remained about the same. The gain was about 
eighty-six per cent of the original rate, which could be interpreted
as a gain of eighty-six per cent in rate of comprehension. 
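
The size of the rate gain can be checked against the means in Table 4. The sketch below is illustrative: the source does not say which selections were compared to arrive at eighty-six per cent, so two plausible comparisons are shown (first selection against the highest mean, and first against the last), and the helper name `percent_gain` is hypothetical.

```python
# Mean rate (words per minute) on the fifteen selections, copied from Table 4.
mean_rates = [274, 289, 328, 354, 341, 383, 374, 438, 421, 453, 476, 479, 508, 495, 494]


def percent_gain(start: float, end: float) -> float:
    """Gain from start to end, expressed as a percentage of the starting value."""
    return 100.0 * (end - start) / start


peak_gain = percent_gain(mean_rates[0], max(mean_rates))   # selection 1 vs. selection 13
final_gain = percent_gain(mean_rates[0], mean_rates[-1])   # selection 1 vs. selection 15
print(f"gain to peak: {peak_gain:.1f}%, gain to last selection: {final_gain:.1f}%")
```

The comparison with the peak at the thirteenth selection comes out at roughly eighty-five per cent, near the reported figure; the comparison with the final selection is somewhat lower because of the flattening after Selection 13.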

After careful consideration, Bond concluded that the gains of
Harvard and Yale students during programs similar to the one
here reported were due to speed rather than comprehension
gains. Hultin found that although his group made a large gain
in rate, it made only a one-and-one-half per cent gain in compre- 
hension. It seems reasonable that with fairly intelligent stu- 
dents, rate can be easily modified, whereas comprehension gains 
are to a considerable degree due to gain in general knowledge or 
practice in skills specific to that particular test used to measure 
comprehension. 

Failure to gain on Selections 14 and 15 could possibly be 
attributed either to decreased motivation with the approach of 
the end of the reading course, or to a limit reached in ability to
speed up. The facts that several groups did not show this 
flattening, and that the 1945-46 group (discussed in next section) 
with noticeably poorer motivation showed flattening after only 
eight selections, seem to favor the decreased motivation hypothe- 
sis. The bend in our curve after the thirteenth selection appears 
too sharp to indicate a ceiling. In any case, the steady rise 
seems to indicate that the limits of gain in rate of reading were 
much higher than might be anticipated. In this connection it 
might be noted that five initially superior students read several 
selections at rates of over one thousand words per minute in the 
latter part of the course with perfect comprehension scores. 

Of special interest to those persons using the equated reading 
series are the fluctuations in the comprehension test scores. 
Mean scores (Table 4) vary from 5.9 to 7.8 items correct out of a 
possible ten. No relation could be found between individual 
spurts in speed and comprehension test results. Test difficulties 
were approximately the same (rank order correlation of .71) 
for the 1944-45 and 1945-46 groups, indicating consistent fluc- 
tuations in difficulty of reading selections, tests, or both. 

1945-46 Instructional Group.—At the beginning of the 1945-46 
school year, a group of two hundred nine freshman students was 
chosen for instruction, on the basis of being below the 25th per- 
centile on the initial Blommers test. As these subjects were in 
general not directly comparable to the 1944-45 group, their 
results have been included in the present paper only in a very 
limited way and only where comparisons seemed justified.
Instructional procedures were much the same, except that classes 
met only twice a week, but thirty times instead of twenty. 
Measures of gain seemed in several instances to favor the earlier 
group which had received more concentrated instruction. The 
1944-45 group showed greater absolute gains on the Michigan, 
Iowa rate, and equated reading speed tests. 

Students and instructors alike reported rather low levels of 
motivation, in comparison with the previous year. It cannot be 
positively concluded, however, that the more concentrated pro- 
cedure is superior, since there were also changes in instructional 
personnel, instructional procedures, and student ability levels. 


3. Prediction of Reading Rate Gains 


The second major objective of this paper was that of predicting 
gains in rate of reading. In order to diversify and individualize 
instruction, it is desirable to be able to predict who will and who
will not profit by a given program.


To attack this problem, a number of variables were selected 
which seemed most likely to be related to the amount of reading 
gain. These are listed below: 

1) Initial test scores on the Blommers Rate of Comprehension 
test, and the Iowa Silent Reading Tests. The relationship of 
these variables to the reading gains obtained has been mentioned 
before to a limited extent. 

2) Four of the Iowa Tests of Educational Development and the
composite score for all six of the tests. The four tests used were:
Interpretation of Reading Materials—Social Sciences, Interpretation
of Reading Materials—Natural Sciences, General Mathematical
Ability, and General Vocabulary.

3) Speed of perception test scores. 

4) Discrepancies between various test scores. In each case 
the difference between the scores on the two tests used was taken 
as the measure of the discrepancy, except that the composite 
score was divided by six before subtracting the second test score. 
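
The discrepancy measures described in item 4 amount to simple differences between two scores. A minimal sketch follows, with a hypothetical function name; for illustration it is applied to the experimental group means from Table 1, whereas the study computed the discrepancy for each student separately.

```python
def composite_discrepancy(composite: float, second_score: float, n_subtests: int = 6) -> float:
    """Composite score divided by the number of subtests, minus the second test score."""
    return composite / n_subtests - second_score


# Illustration with group means (Table 1): composite 312.4, initial Blommers 43.8.
d = composite_discrepancy(312.4, 43.8)
print(f"Composite/6 - Blommers = {d:.2f}")
```

A large positive value marks a student whose general test standing is well above his Blommers rate score.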

These initial scores were next correlated with the gains on the 
three reading tests. Since no systematic differences were noted 
in the initial freshman examination scores of the various instruc- 
tional groups, it was felt that a correlation based on all the scores 
TABLE 5.—PEARSON PRODUCT-MOMENT CORRELATION COEFFICIENTS BETWEEN
READING GAINS AND BETWEEN READING GAINS AND MEASURES AVAILABLE
BEFORE INSTRUCTION

                                          Gains
                                    Blommers   Michigan      Iowa Rate
                              N     Rate       (No. right)   Subtest
Gains
  Blommers Gain               167              -.10           .02
  Michigan Gain               167                             .13
  Iowa Rate Subtest Gain      167
Initial Test Scores
  Blommers                    167   -.20*      -.09           .02
  Michigan                    167    .20*      -.45*         -.02
  Iowa Rate Subtest           167    .21*      -.06          -.31*
  Iowa Rate Subtest, (C)      167    .19**     -.04          -.31*
  Interp. R. Mat. Social St.  167   [illegible]  .01         -.04
  Interp. R. Mat. Nat. Sc.    167    .12       -.12           .03
  General Mathematics         167    .06       -.05           .04
  General Vocabulary          167    .05        .05          -.04
  Freshman Composite          167    .26*      -.06           .02
  Speed of Perception         133   -.05       -.03          -.07
Discrepancies
  Gen. Math.—Gen. Voc.        167    .01       -.08           .07
  Gen. Voc.—Blommers          167    .21*       .11          -.03
  Composite/6—Blommers        167    .42*       .07           .00
  Gen. Voc.—Perception        133    .08        .06           .07
  Gen. Math.—Perception       133    .09       -.02           .06
  Blommers—Perception         133   -.05       -.05           .02
  Michigan—Perception         133    .13       -.29*          .08

Correlations marked * are significant at the 1% level.
Correlations marked ** are significant at the 5% level.
of the 1944-45 group would be as valid as the ‘within-groups’
method.

Results of this analysis are shown in Table 5. In the first 
place, the gains on the three major rate of reading tests used are 
not significantly correlated. It is also apparent that only twelve 
of the fifty-four r’s are significant at even the five per cent level of 
confidence, so that few of the factors here considered were related 
to gains. Furthermore, no single factor is statistically sig- 
nificantly correlated in the same direction with the gain on more 
than one of the tests, so prediction of general reading gains from 
these factors is not feasible. 
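
The significance markings in Table 5 can be checked roughly from r and N alone. The sketch below uses the Fisher z approximation, which is an assumption of this illustration (the test the authors actually applied is not stated), and the function name `significance_level` is hypothetical.

```python
import math
from statistics import NormalDist


def significance_level(r: float, n: int) -> str:
    """Classify a Pearson r as significant at the 1% level, the 5% level, or neither.

    Assumption of this sketch: under the hypothesis of zero correlation,
    atanh(r) * sqrt(n - 3) is treated as a standard normal deviate
    (Fisher z approximation), and the test is two-sided.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    p_two_sided = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    if p_two_sided < 0.01:
        return "1%"
    if p_two_sided < 0.05:
        return "5%"
    return "ns"


if __name__ == "__main__":
    # A few entries taken from Table 5 (r, N).
    for label, r, n in [
        ("Iowa Rate Subtest (C) vs. Blommers gain", 0.19, 167),
        ("Blommers initial vs. Blommers gain", -0.20, 167),
        ("Michigan gain vs. Iowa gain", 0.13, 167),
        ("Michigan-Perception vs. Michigan gain", -0.29, 133),
    ]:
        print(f"{label}: r = {r:+.2f}, N = {n}, level = {significance_level(r, n)}")
```

With N = 167 the dividing line between the five and one per cent levels falls between r = .19 and r = .20 under this approximation, which agrees with the asterisks in the table.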

Examination of individual correlations points up several fur- 
ther findings. A consistent tendency appears for gains on a test 
to be significantly correlated negatively with the initial scores on 
that test: —.20 for Blommers, —.45 for Michigan, and —.31 for 
Iowa rate subtest. Although this indicates that the poorest
readers gain the most on any given test, the small size of the
correlations allows for large gains at any level. Initial Michigan
and Iowa rate subtest scores correlate positively with Blommers
gain (.20 and .21), but there is no statistically significant
relationship between Michigan and Iowa gains and initial
scores on the Blommers. Again, it is evident that the tests are
not measuring the same thing. Blommers gain is best predicted 
from the size of the discrepancy between the initial Blommers 
and the composite (a good measure of verbal intelligence), while 
Michigan and Iowa rate gains are best predicted from the initial 
scores on these tests alone. 

It should be apparent that the present tests cannot be used 
efficiently either to predict or to measure gains. The results 
here reported can be useful chiefly as hints to prediction, or leads
for further investigation.


4. General Implications of the Findings 


(a) The tests used in this program appear unequal to the 
specialized task of accurately measuring and predicting gains in 
reading ability for the relatively homogeneous group of students
likely to be found in many college reading programs. Low 
initial inter-test correlations indicate a need for more general and 
valid tests in this area. Estimated test-retest reliabilities for 
these tests tend to be very low. Correlations for the forty-two 
control subjects from initial test to final test serve as rough 
estimates of the reliabilities for the whole group. The correlations
obtained were: Iowa rate, .68; Blommers, .58; and Michigan,
.78. Ceilings are too low for even moderately good readers.
Seven persons finished the final Michigan test, over sixty per 
cent completed at least one part of the Iowa rate subtest, and 
almost all at least attempted all of the Blommers items. The 
nature of the Blommers test is such that this is probably not too 
serious in its case. 

There is evidence that the norms for the tests could profitably 
be revised better to fit present college populations. The control 
group in this experiment was chosen as below average on the 
Blommers rate test, but turned out to be ‘above average’ on 
the Michigan and Iowa tests according to current norms. The 
1945-46 experimental group (N = 156) was chosen as below 
the 25th percentile on the Blommers rate test and the average of 
the group’s Iowa median scores fell at the 34th percentile. In 
spite of this, the group averaged at the 53rd percentile on the 
Iowa rate sub-test which should have been the subtest on which 
they scored lowest, considering the criterion of selection. 

(b) The over-all gains of the experimental over the control 
group may seem to be rather small. Considering the probable
low reliability and consequently low validity of the tests, the
uniformity of gain gives quite satisfactory testimony to the
effectiveness of the program. Correlations were doubtless attenuated to
some degree by a ‘between-groups’ effect, and by the homogeneity 
of the total group. 

It is apparent that other than intellectual factors must con- 
tribute considerably to a gain. It is evident from Table 4 that 
verbal intelligence per se as indicated by vocabulary and fresh- 
man composite scores contributes very little to the prediction of 
reading gains. By exclusion, then, it would seem that personal- 
ity factors are quite important and must be taken into consider- 
ation. That this line of investigation may well prove profitable 
is indicated by Goodsell’s'! finding of a correlation of approxi- 
mately .50 between ‘adjustment’ scores from a paper-and-pencil 
test and reading scores on the high-school level. 

(c) Above all, it is necessary in this type of study of reading 
gains to avoid as much as possible certain common pitfalls. 
Examination of the pertinent literature shows frequent failure 
to correct for regression effects either by use of a control group 
or by statistical manipulation. Without making this correction,
one could within wide limits obtain increasingly larger ‘training’ 
gains by selecting people with increasingly lower initial test 
scores. A similar problem arises when practice during training 
essentially duplicates the materials and processes used in the 
evaluation tests themselves. This can give very misleading 
results, and again emphasizes the need for more generally valid 
college reading tests. 
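
The regression effect described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reanalysis of the study's data: scores are generated as a true score plus independent error on two testings, no training is given, and the test-retest reliability is set to .58 (the Blommers figure reported for the control group). The function name `simulate_untrained_gain` is hypothetical.

```python
import random
import statistics


def simulate_untrained_gain(cutoff_fraction: float,
                            reliability: float = 0.58,
                            n: int = 20000,
                            seed: int = 0) -> float:
    """Mean retest 'gain' (in SD units) of the lowest-scoring fraction on test 1.

    No one receives any training; each observed score is a true score plus
    independent error, scaled so observed scores have unit variance and the
    two testings correlate at `reliability`.
    """
    rng = random.Random(seed)
    true_sd = reliability ** 0.5
    error_sd = (1.0 - reliability) ** 0.5
    people = []
    for _ in range(n):
        true_score = rng.gauss(0.0, true_sd)
        first = true_score + rng.gauss(0.0, error_sd)
        second = true_score + rng.gauss(0.0, error_sd)
        people.append((first, second))
    people.sort(key=lambda pair: pair[0])          # lowest initial scores first
    selected = people[: max(1, int(n * cutoff_fraction))]
    return statistics.fmean(second - first for first, second in selected)


if __name__ == "__main__":
    for fraction in (0.50, 0.25, 0.10):
        gain = simulate_untrained_gain(fraction)
        print(f"lowest {fraction:.0%} selected: mean 'gain' = {gain:.2f} SD")
```

The stricter the selection on the initial test, the larger the apparent gain, although nothing but measurement error separates the two testings. A control group selected in the same way absorbs this effect.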

The gains by the control group in this study point to perhaps 
the most frequent error made in evaluating reading program 
gains, i.e., practice effects on tests and the general gains to be
expected during college attendance are often interpreted as due 
specifically to reading programs. One of Dearborn and Wilk- 
ings’® ‘control’ groups showed a gain from the 14th to the 39th 
percentile on a speed-of-reading test. Gladfelter® shows statis- 
tically significant reading gains for ‘untreated’ freshmen between 
September and June. Weber’s*! matched control group on 
delayed retesting showed in one case almost two-thirds the gain 
of the experimental group. Any of these results under other 
circumstances might easily have been used to ‘prove’ the value 
of some particular reading program. 

(d) One of the greatest dangers to the long-range effectiveness 
of reading programs is the low level of the goals set themselves 
by the directors of such programs. No real limit has yet been 
found to the practical improvement of reading on the college 
level. Iowa instructors found that gains in their classes were 
closely related to their own expectations. Individuals encour- 
aged to gain fifty per cent in rate gained about that, while other 
individuals encouraged to gain over one hundred per cent gained 
over one hundred per cent. Gains seem to reflect the reading skills
aimed at in instruction. If we wish to have general reading gains,
we must provide training in each of the major reading skills singly 
and in combination with all the others. 

Table 6 shows the gains achieved in a series of reading programs 
at least roughly comparable to that reported in this paper. Only 
rate gains in relatively free reading are included. Gains by 
groups given special training range from twenty-five to ninety- 
nine per cent. It would be easy to consider a fifty-six or an 
eighty-six per cent gain quite satisfactory. On the other hand, 
a superior reading program might well aim at one hundred to one 
hundred fifty per cent gain in absolute reading ability. Table 6
shows that most reading programs come nowhere near the maxi- 
mum gain here reported. An even more impressive finding is 
that of Hultin'’® whose highly-motivated and rather highly 
selected upper-class students made an average gain of one hun- 
dred ninety per cent in rate. Thus it is hard to predict how much 
reading gain might be achieved under optimum conditions of 
ability, motivation, and instruction. 


TABLE 6.—GAINS IN RATE OF READING IN TRAINING PROGRAMS
SIMILAR* TO THAT REPORTED IN THIS PAPER

Reference                       Per Cent Gain Over
(See Bibliography)              Initial Speed

1                               99
6                               8
26                              25
10                              52, 33, 48
27                              54
21                              35
Present experimental group      86


* Programs roughly comparable as to initial performance level of stu- 
dents, method of selection of students, time devoted to instruction, methods 
of instruction, and in that measurement was of something approaching 
‘free’ reading. 


SUMMARY AND CONCLUSIONS 


Under a comprehensive basic communications program, one 
hundred sixty-seven college freshmen completed the work in a 
required twenty-hour reading improvement course giving aca- 
demic credit. This group was compared with a control group of 
forty-two students who were given the same tests, but no reading 
instruction. Findings can be summarized as follows: 

1) The experimental group made statistically and practically 
significant gains over the control group in scores on the Blom- 
mers’ Rate of Comprehension Test, the median of six subtest 
scores from the Iowa Silent Reading Tests, and the Michigan Speed 
of Reading Test. 

2) The experimental group showed a gain in rate of reading 
(actually rate of comprehension) from an average of two hundred 
seventy-four to an average of five hundred eight words per minute 
on the Harvard equated transfer reading selections with no
apparent loss in comprehension. This represents a rate gain 
of eighty-six per cent. A survey of results from six similar read- 
ing programs showed gains ranging from eight per cent to ninety- 
nine per cent. 

3) An additional group of students with a greater number of 
class meetings but with classes meeting less frequently showed 
similar but smaller gains, both on tests and equated reading 
selections. 

4) Increases in rate of reading could not be predicted with any 
appreciable gain over chance accuracy from initial scores on 
reading tests or freshman entrance examinations. Only twelve
of the fifty-four correlations were statistically significant at
or beyond the five per cent level of confidence, and none was larger
in absolute value than .45.

5) Test-retest reliabilities (N = 42) on Blommers’ Rate of
Comprehension Test, the Iowa Silent Reading Tests, and the 
Michigan Speed of Reading Test ranged from .58 to .78, much too 
low for individual prediction. Ceilings were reached on these 
tests by many students. 

It can be concluded that: 

1) It is possible to secure very large gains in rate of reading as a 
result of college reading programs. Many programs have been 
satisfied with small gains which could easily have been due to 
factors other than specific training. Twenty hours of training 
can reasonably be expected to produce rate gains of 75 per cent 
to 100 per cent in ‘free’ reading. 

2) Present tests of reading at the college level seem to lack 
reliability and hence validity. Their ceilings are too low, and 
the choice of subtests has been made on an inadequate a priori 
basis. Tests which are presumed to measure essentially the 
same aspects of reading ability show low intercorrelations. New 
or revised college reading tests are needed. 

3) With present types of paper-and-pencil tests, it does not 
appear possible to predict gains in rate of reading resulting from a 
training program. 
THE INTERPRETATION OF PRINCIPAL-AXIS
FACTORS 


FREDERICK B. DAVIS 
George Peabody College for Teachers 


In a recent issue of THE JOURNAL OF EDUCATIONAL PSYCHOLOGY,
Holzinger¹ illustrated a meaningful and useful presentation
of the results of a principal-axis analysis. It is the purpose of 
this article to relate Holzinger’s presentation to certain methods 
of interpreting the results of principal-axis analyses which the 
writer has employed. For the convenience of the reader, 
Holzinger’s illustrative example and notation will be used. 
Formulas and tables in this article have been numbered in such 
a way as to avoid confusion with those presented by Holzinger. 

Factors $p_\alpha$ and $p_\beta$ may be expressed as linear combinations of
the two original test variables, $x$ and $y$. Bordered by the
variances of the two original variables and the two independent
factors, the direction cosines obtained by analysis may be
presented in tabular form like Holzinger’s Table 1.


TABLE 1
              $p_\alpha$        $p_\beta$         Variance
x             cos θ      -sin θ     $v_x$
y             sin θ      cos θ      $v_y$
Variance      $v_\alpha$        $v_\beta$         Total


The numerical data used by Holzinger were obtained by analy- 
zing a matrix of variances and covariances in which the diagonal 
entries were the variances of the original test variables. In 
practical work it would be unusual to construct a matrix for 
analysis in this manner because only rarely would the importance 
of the variables in a quantitative representation of the complex 
to be analyzed correspond closely to the relative magnitudes of 
their variances. However, for illustrative purposes, we may 
assume that variable x is judged to be 2.25 times as important 





1 K. J. Holzinger, “A Comparison of the Principal-Axis and Centroid
Factors,” J. Educ. Psychol. XXXVII, No. 8 (November 1946), pp. 449-472.

Since the writer has not repeated derivations given by Holzinger, the 
reader should have the latter’s article at hand for reference. 
as variable y or that some objective bases for the relative weights,
such as the magnitudes of empirically obtained regression coeffi- 
cients, exist. The numerical equivalent of Table 1 is given in 
Holzinger’s Table 1’. 


TABLE 1’
              $p_\alpha$        $p_\beta$         Variance
x             .8663      -.4995     225
y             .4995      .8663      100
Variance      287.3      37.7       325


As Holzinger points out, the direction cosines shown in Table 1’ 
are not susceptible of easy interpretation. But, since they offer 
the most convenient means for computing individual factor 
scores, it is important that they be presented. 

Often at this stage of a factorial study, the investigator is 
interested in identifying the psychological meaning of the factors 
he has obtained.² Therefore, he wants to know the weight of
the mental function measured by each of the original test variables
in determining individual scores in each of the factors.³
For this purpose, it is often helpful to know the weight that would 
have to be attached to standard measures in each of the original 
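test variables to obtain individual scores in each factor. Since
$x = \sigma_x z_x$ and $y = \sigma_y z_y$, we may rewrite Holzinger’s equations (5)
as follows:

$$
\begin{aligned}
p_\alpha &= (\cos\theta\,\sigma_x)\,z_x + (\sin\theta\,\sigma_y)\,z_y \\
p_\beta &= (-\sin\theta\,\sigma_x)\,z_x + (\cos\theta\,\sigma_y)\,z_y
\end{aligned}
\tag{5a}
$$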


The terms enclosed in parentheses in equations (5a) are the 
desired weights. Each one consists of a direction cosine multi- 
plied by the standard deviation of the corresponding original test 
variable. For the illustrative problem, the numerical data are 
given in Table 1a.





2 The writer hopes that the current tendency to name or identify factors 
on the basis of subjective judgment will give way to the practice of identi- 
fying factors objectively in terms of their social utility; that is, in terms of 
their efficiency for measuring useful human aptitudes and achievements. 

3 In practice, it would be desirable at this time to determine which factors 
had variances greater than might be obtained by chance (at, say, the five- 
per cent level). 
TABLE 1a
              $z_x$        $z_y$        Variance
$p_\alpha$           12.99      5.00       287.3
$p_\beta$            -7.49      8.66       37.7
Variance      225        100        325


The relative weights of the correlated mental functions
measured by the original test variables are correctly shown in
Table 1a, but a word of caution concerning the interpretation of
these data must be provided. Since original test variables $x$
and $y$ are correlated, standard scores in these two variables
($z_x$ and $z_y$) are similarly correlated. Hence, the relative weights
of standard scores in variables $x$ and $y$ for determining scores in
factors $p_\alpha$ and $p_\beta$ reflect the influence of the correlation of
variables $x$ and $y$. This is an obvious point, but easily overlooked.

The relative weights shown in Table 1a are meaningful in a
very practical sense because they show the actual weight of the
correlated mental functions, as measured by variables $x$ and $y$,
in determining scores in factors $p_\alpha$ and $p_\beta$. If the investigator
wants to know the contribution of each of variables $x$ and $y$ to
the variance of factors $p_\alpha$ and $p_\beta$, the weights in Table 1a should
not be used to obtain that information.

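The variances of factors $p_\alpha$ and $p_\beta$ are given by Holzinger in
his equations (6). The analogous equation for the variance
of component $p_\alpha$, as the latter is defined above by equation (5a),
is as follows:

$$
v_\alpha = (\cos^2\theta\, v_x)\, v_{z_x} + (\sin^2\theta\, v_y)\, v_{z_y}
 + 2(\cos\theta\,\sigma_x)\,\sigma_{z_x}(\sin\theta\,\sigma_y)\,\sigma_{z_y}\, r_{z_x z_y}
\tag{6a}
$$

Since $\sigma_{z_x} = \sigma_{z_y} = 1$, equation (6a) may be rewritten as follows:

$$
\begin{aligned}
v_\alpha ={}& \cos\theta\,\sigma_x(\cos\theta\,\sigma_x + \sin\theta\,\sigma_y\, r_{z_x z_y}) \\
 &+ \sin\theta\,\sigma_y(\cos\theta\,\sigma_x\, r_{z_x z_y} + \sin\theta\,\sigma_y)
\end{aligned}
\tag{7a}
$$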


The contribution of variable $x$ to the variance of factor $p_\alpha$
is given by the term on the first line of equation (7a); the
contribution of variable $y$ to the variance of factor $p_\alpha$ is given by
the term on the second line of equation (7a). The numerical
result for the illustrative problem is as follows:

$$
v_\alpha = 12.99[12.99 + (5.00)(.72)] + 5.00[(12.99)(.72) + 5.00]
\tag{8a}
$$

or:

$$
v_\alpha = 215.5 + 71.8 = 287.3
\tag{9a}
$$

In other words, $\frac{215.5}{287.3}$, or seventy-five per cent of the variance
of factor $p_\alpha$, is attributable to original test variable $x$ and
$\frac{71.8}{287.3}$, or twenty-five per cent of the variance of factor $p_\alpha$, is
attributable to original test variable $y$. Similar procedures indicate that
the variance of factor $p_\beta$ may be written by an equation analogous
to (9a) as:

$$
v_\beta = 9.4 + 28.3 = 37.7
\tag{9b}
$$


If the reader will glance at Holzinger’s equations (10) he will 
see that they yield identical numerical results (within rounding 
errors) for the illustrative problem. Starting with one method of 
presenting factorial data used by the writer, we have obtained 
the type of data recommended by Holzinger, thus showing the 
relationship of the two types. 
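
The arithmetic of equations (8a), (9a), and (9b) can be retraced with a few lines of code. This is a minimal sketch: the weights are those of Table 1a, the correlation .72 is the value appearing in equation (8a), and the function name `variance_contributions` is hypothetical.

```python
def variance_contributions(w_x: float, w_y: float, r_xy: float) -> tuple[float, float]:
    """Split a factor's variance into the parts attributable to x and to y.

    w_x and w_y are the weights applied to standard scores z_x and z_y
    (direction cosine times test standard deviation, as in Table 1a).
    Each part follows one line of equation (7a): the test's own squared
    weight plus its half of the cross-product term.
    """
    from_x = w_x * (w_x + w_y * r_xy)
    from_y = w_y * (w_x * r_xy + w_y)
    return from_x, from_y


if __name__ == "__main__":
    r_xy = 0.72
    for name, (w_x, w_y) in {"p_alpha": (12.99, 5.00), "p_beta": (-7.49, 8.66)}.items():
        from_x, from_y = variance_contributions(w_x, w_y, r_xy)
        total = from_x + from_y
        print(f"{name}: {from_x:.1f} + {from_y:.1f} = {total:.1f} "
              f"({100 * from_x / total:.0f}% from x, {100 * from_y / total:.0f}% from y)")
```

Because the inputs are the rounded entries of Table 1a, the totals agree with the tabled variances only to the first decimal place.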

Commenting on factorial solutions employing principal factors 
in deviate and standard form, Holzinger remarks:⁴ “the factoring
of variances and covariances greatly ‘exaggerates’ the relative
importance of factors . . .” and concludes that the “psychologist
will probably prefer the analysis of correlations and unit
test variances because he can make his tests about equally long
and thus may consider them all equally important in the subsequent
factor analysis.” In this connection the writer would
like to call attention to another consideration which he thinks 
psychologists should take into account when they construct 
matrices for purposes of factorial analysis. If the tests to be 
factored are weighted equally for purposes of analysis, the 
investigator is either explicitly or tacitly making the assumption 
that all of the tests are equally important in the complex of 
abilities that is being analyzed. If, for example, the complex of 
abilities called mechanical comprehension was being investigated, 
the assumption would be that each test entered in the matrix of 
intercorrelations of tests believed to measure mental functions 





4 Loc. cit., p. 458.
involved in mechanical comprehension was as important as
every other test. The writer doubts that this assumption of 
equal importance can ordinarily be justified. The appropriate- 
ness of the assumption can be tested in many instances by having 
a group of competent judges indicate independently the impor- 
tance of the mental function measured by each test proposed for 
use in the complex of abilities to be analyzed. When the pooled 
judgment of competent authorities indicates that the mental 
functions to be analyzed are significantly different in importance 
(as will usually occur), the writer believes that the tests should
be entered into the matrix used for factorial analysis with variances
made proportional to their importance as indicated by
the pooled judgment of authorities.⁵ This immediately leads
to the analysis of a matrix of variances and covariances instead 
of an analysis of correlations and unit test variances. Properly 
interpreted (as, for example, by the method recommended by 
Holzinger), analyses of matrices of realistically weighted vari- 
ances and covariances can be entirely meaningful to psychol- 
ogists, and may often be the method of choice in factorial 
analysis. The writer must emphasize that in making this state- 
ment he is in no sense questioning the meaningfulness or appro- 
priateness of other procedures used in factorial analyses. His 
point is simply that the analysis of weighted variances and 
covariances is the general procedure of which the analysis of 
unit variances and correlations is a specific case to be used when 
the circumstances warrant. 

A second method of presenting the results of principal-axis 
factorial analyses which the writer has used consists in showing 
the correlation coefficients between each of the original test 
variables and each of the factors obtained by analysis. Making 
a specific application of the general formula for the correlation
of a weighted sum and a single variable,⁶ we may write for one
of these correlations in the case of our illustrative example:

$$
r_{x p_\alpha} = \frac{\sum x(x\cos\theta + y\sin\theta)}
{\sqrt{\sum x^2}\,\sqrt{\sum (x\cos\theta + y\sin\theta)^2}}
\tag{10a}
$$














5 Vid., T. L. Kelley, Essential Traits of Mental Life. Cambridge: Harvard 
University Press, 1935, p. 93. 

6 Cf. T. L. Kelley, Statistical Method. New York: Macmillan Co., 1924,
Formula 149.
Simplifying and substituting, we obtain:

$$
r_{x p_\alpha} = \frac{\sigma_x\cos\theta + \sigma_y\sin\theta\, r_{xy}}{\sigma_{p_\alpha}}
\tag{11a}
$$

For our illustrative example, the numerical result is:

$$
r_{x p_\alpha} = \frac{16.591}{16.949} = .979
$$


By analogous procedures the correlation coefficient between 
each of the initial test variables and each factor may be com- 
puted. For the illustrative example the resulting data may be 
presented as in Table 2a. 

In actual practice (especially as the number of test variables 
is increased), the correlation between each factor and each 
initial test variable would be obtained by multiplying each 
direction cosine by the standard deviation of the corresponding 
factor divided by the standard deviation of the corresponding 
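initial test variable. In terms of the notation used for our
illustrative problem:

$$
\begin{aligned}
\frac{\cos\theta\,\sigma_{p_\alpha}}{\sigma_x} &= \cos\alpha = r_{x p_\alpha} && \text{(12a)} \\
\frac{-\sin\theta\,\sigma_{p_\beta}}{\sigma_x} &= -\sin\alpha = r_{x p_\beta} && \text{(13a)} \\
\frac{\cos\theta\,\sigma_{p_\beta}}{\sigma_y} &= \cos\beta = r_{y p_\beta} && \text{(14a)} \\
\frac{\sin\theta\,\sigma_{p_\alpha}}{\sigma_y} &= \sin\beta = r_{y p_\alpha} && \text{(15a)}
\end{aligned}
$$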


Equations (12a), (13a), (14a), and (15a) follow immediately
from Holzinger’s equations (23)⁷ and from the data as represented
in Figure 4 of Holzinger’s article. It will be noted that
the absolute values for $\sin\alpha$, $\cos\alpha$, $\sin\beta$, and $\cos\beta$ which
Holzinger provides⁸ are identical to three decimal places with the
values of the corresponding correlation coefficients presented in
this article in Table 2a.





7 Immediately preceding these in Holzinger’s development there is a
slight typographical error in the first of equations (22), which should read:
$x = (\sigma_x \cos\alpha)\,z_1 - (\sigma_x \sin\alpha)\,z_2$.

8 Loc. cit., p. 467.
TABLE 2a.—CORRELATION COEFFICIENTS BETWEEN PRINCIPAL
FACTORS AND INITIAL TEST VARIABLES

              $p_\alpha$        $p_\beta$
x             .979       -.204
y             .847       .532
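
The entries of Table 2a can be reproduced from the published direction cosines and variances of Table 1’ by the rule just described: each direction cosine is multiplied by the standard deviation of the factor and divided by the standard deviation of the test. The sketch below is illustrative only, and the function name `factor_test_correlation` is hypothetical.

```python
import math


def factor_test_correlation(direction_cosine: float,
                            factor_variance: float,
                            test_variance: float) -> float:
    """Correlation between a principal factor and one original test variable.

    Implements direction cosine * (sigma_factor / sigma_test).
    """
    return direction_cosine * math.sqrt(factor_variance / test_variance)


if __name__ == "__main__":
    # Direction cosines and variances as printed in Table 1'.
    cosines = {
        ("x", "p_alpha"): 0.8663, ("x", "p_beta"): -0.4995,
        ("y", "p_alpha"): 0.4995, ("y", "p_beta"): 0.8663,
    }
    test_var = {"x": 225.0, "y": 100.0}
    factor_var = {"p_alpha": 287.3, "p_beta": 37.7}

    for (test, factor), cosine in cosines.items():
        r = factor_test_correlation(cosine, factor_var[factor], test_var[test])
        print(f"r({test}, {factor}) = {r:+.3f}")
```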


For psychologists most familiar with the results of factorial 
analyses performed by the centroid method, the presentation of 
the results of a principal-axis analysis in terms of correlation 
coefficients (as in Table 2a) may prove helpful. It must be 
remembered, however, that factor scores for individuals may not 
be computed directly from the data in Table 2a. To accomplish 
this, given the coefficients in Table 2a, it would be necessary to 
solve for regression coefficients capable of yielding factor scores. 
For factor $p_\alpha$, for example, one may proceed as follows to compute
the required coefficients:

$$
\beta_x = \frac{r_{x p_\alpha} - r_{y p_\alpha}\, r_{xy}}{1 - r_{xy}^2} = .7666
$$

$$
\beta_y = \frac{r_{y p_\alpha} - r_{x p_\alpha}\, r_{xy}}{1 - r_{xy}^2} = .2950
$$

and in deviation form:

$$
b_x = \beta_x \frac{\sigma_{p_\alpha}}{\sigma_x} = .8663
$$

$$
b_y = \beta_y \frac{\sigma_{p_\alpha}}{\sigma_y} = .5000
$$

The multiple correlation coefficient between scores in factor
$p_\alpha$ and scores in variables $x$ and $y$ is found to be:

$$
R_{p_\alpha \cdot xy} = \sqrt{(.7666)(.979) + (.2950)(.847)} = 1.000
$$

For factor $p_\beta$, the coefficients are as follows:

$$
\beta_x = -1.2203
$$

$$
\beta_y = 1.4109
$$

and in deviation form:

$$
b_x = -.4995
$$

$$
b_y = .8663
$$

The multiple correlation coefficient between scores in factor
$p_\beta$ and scores in variables $x$ and $y$ is found to be:

$$
R_{p_\beta \cdot xy} = \sqrt{(-1.2203)(-.204) + (1.4109)(.532)} = 1.000
$$


The reader will have noticed that in deviation form the coefficients
in the multiple regression equations for ‘estimating’
scores in factors $p_\alpha$ and $p_\beta$ from deviation scores in variables $x$
and $y$ are identical with the direction cosines obtained by factorial
analysis and presented in Table 1’. This is naturally the case
since factors $p_\alpha$ and $p_\beta$ are composites of variables $x$ and $y$
weighted in terms of the direction cosines. The multiple correlation
coefficient must, therefore, turn out to be unity in each
case, which explains why the word ‘estimating’ was placed in
quotation marks earlier in this paragraph. These facts point
up an important characteristic of the direction cosines obtained
by the principal-axis analysis of variances and covariances;
namely, that they are, in one sense, coefficients in a multiple
regression equation that will ‘predict’ the criterion (the factor)
perfectly. As Holzinger has said, “. . . This property is
one of the chief virtues of the principal-axis solution. . . .”
The fact that the direction cosines obtained by factorial analysis 
and presented in Table 1’ can be used to obtain individual scores 
in each factor explains why the writer has always presented them 
even though they cannot conveniently be used for interpreting 
the psychological meanings of the factors. They constitute the 
most directly useful products of a factorial analysis and should 
be published together with various interpretive data, such as 
those recommended by Holzinger in his Table 2. 
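
The regression coefficients discussed above can be recomputed from the rounded correlations of Table 2a. The sketch below uses the standard two-predictor formulas; because it starts from three-decimal correlations, its results agree with the printed coefficients only to about the third decimal place. The function name `standard_regression_weights` is hypothetical.

```python
import math


def standard_regression_weights(r_fx: float, r_fy: float, r_xy: float):
    """Standard-score regression weights of a factor on x and y, plus multiple R.

    r_fx, r_fy: correlations of the factor with x and with y (Table 2a).
    r_xy: correlation between the two tests.
    """
    denom = 1.0 - r_xy ** 2
    beta_x = (r_fx - r_fy * r_xy) / denom
    beta_y = (r_fy - r_fx * r_xy) / denom
    multiple_r = math.sqrt(beta_x * r_fx + beta_y * r_fy)
    return beta_x, beta_y, multiple_r


if __name__ == "__main__":
    r_xy = 0.72
    sd = {"x": 15.0, "y": 10.0}
    factors = {
        "p_alpha": {"r": (0.979, 0.847), "sd": math.sqrt(287.3)},
        "p_beta": {"r": (-0.204, 0.532), "sd": math.sqrt(37.7)},
    }
    for name, info in factors.items():
        beta_x, beta_y, big_r = standard_regression_weights(*info["r"], r_xy)
        # Deviation-score weights: rescale by sigma_factor / sigma_test.
        b_x = beta_x * info["sd"] / sd["x"]
        b_y = beta_y * info["sd"] / sd["y"]
        print(f"{name}: beta_x={beta_x:.4f}, beta_y={beta_y:.4f}, "
              f"b_x={b_x:.4f}, b_y={b_y:.4f}, R={big_r:.3f}")
```

The deviation-score weights come back as approximately the direction cosines of Table 1’, and the multiple correlation as approximately unity, which is the point made in the text.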

A property of the direction cosines that the writer hopes will 
become commonly known in the future is their convenience for 
computing the correlation of scores in a given factor with an 
external criterion, thus permitting objective evaluation of the 
social utility of factor scores for vocational and educational guid- 
ance and for many other purposes without the necessity of 
computing and validating the factor scores empirically. If 
the validity coefficients of the original test variables are known, 
a modification of the usual formula for the correlation of a 
weighted sum with a criterion variable may be employed.⁹ If








9 Ibid., Formula 149.
scores are in deviation form, this formula, in the general case for
factor $j$ defined as a weighted sum of $n$ original test variables, is
as follows:

$$
r_{oj} = r_{(o)(w_1 x + w_2 y + \cdots + w_n n)}
= \frac{\sum (o)(w_1 x + w_2 y + \cdots + w_n n)}
{\sqrt{\sum o^2}\,\sqrt{\sum (w_1 x + w_2 y + \cdots + w_n n)^2}}
\tag{16a}
$$

This simplifies to:

$$
r_{oj} = \frac{w_1 \sigma_x r_{ox} + w_2 \sigma_y r_{oy} + \cdots + w_n \sigma_n r_{on}}{\sigma_j}
\tag{17a}
$$

For the illustrative example used throughout this article, the
specific applications of formula (17a) would be:

$$
r_{o p_\alpha} = \frac{\cos\theta\,\sigma_x r_{ox} + \sin\theta\,\sigma_y r_{oy}}{\sigma_{p_\alpha}}
\tag{18a}
$$

$$
r_{o p_\beta} = \frac{-\sin\theta\,\sigma_x r_{ox} + \cos\theta\,\sigma_y r_{oy}}{\sigma_{p_\beta}}
\tag{19a}
$$


It is apparent from an inspection of formulas (17a), (18a), and
(19a) that the computation of validity coefficients for principal
factors can be accomplished very rapidly if validity coefficients
for the original test variables are available. The procedure
in no way interferes with the analysis of a matrix in which
the test variances have been weighted according to their judged
importance, as is the case when the criterion variable or variables
are included in the matrix.
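
Formula (17a) lends itself to a short routine. In the sketch below the function name `factor_validity` is hypothetical, and the validity coefficients .50 and .40 in the second part are invented purely for illustration; they do not come from the article. As a check, taking test x itself as the 'criterion' (so that its validities are 1.00 and .72) should return the correlations of x with each factor.

```python
import math


def factor_validity(weights, test_sds, test_validities, factor_sd):
    """Correlation of a factor with an external criterion, after formula (17a).

    weights: direction cosines defining the factor as a weighted sum.
    test_sds: standard deviations of the original test variables.
    test_validities: correlations of each test with the criterion.
    factor_sd: standard deviation of the factor scores.
    """
    numerator = sum(w * s * r for w, s, r in zip(weights, test_sds, test_validities))
    return numerator / factor_sd


if __name__ == "__main__":
    sds = [15.0, 10.0]
    alpha = ([0.8663, 0.4995], math.sqrt(287.3))   # Table 1' values
    beta = ([-0.4995, 0.8663], math.sqrt(37.7))

    # Check: criterion = test x, so r_ox = 1.00 and r_oy = r_xy = .72.
    print("x as criterion:",
          round(factor_validity(alpha[0], sds, [1.0, 0.72], alpha[1]), 3),
          round(factor_validity(beta[0], sds, [1.0, 0.72], beta[1]), 3))

    # Hypothetical external criterion (validities are made up for illustration).
    hypothetical = [0.50, 0.40]
    print("hypothetical criterion:",
          round(factor_validity(alpha[0], sds, hypothetical, alpha[1]), 3),
          round(factor_validity(beta[0], sds, hypothetical, beta[1]), 3))
```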

An interesting property of the variance of a principal-axis 
factor is that the portion of it attributable to any one of the 
original test variables has a correlation coefficient with an external 
criterion identical with that between factor scores and the 
external criterion. This can be demonstrated very simply, 
using data pertaining to our illustrative example. 

As shown by Holzinger’s equations (4), the original test vari- 
ables may be written: 


$$
\begin{aligned}
x &= p_\alpha \cos\theta - p_\beta \sin\theta \\
y &= p_\alpha \sin\theta + p_\beta \cos\theta
\end{aligned}
\tag{4}
$$


The correlation of the criterion with that part of test $x$
attributable to factor $p_\alpha$ is given by the equation:

$$
r_{(o)(p_\alpha \cos\theta)} = \frac{\sum (o)(p_\alpha \cos\theta)}
{\sqrt{\sum o^2}\,\sqrt{\sum (p_\alpha \cos\theta)^2}}
\tag{20a}
$$

This immediately reduces to:

$$
r_{(o)(p_\alpha \cos\theta)} = r_{o p_\alpha}
\tag{21a}
$$

Similarly, the correlation of the criterion with that part of
test $y$ attributable to factor $p_\alpha$ reduces to:

$$
r_{(o)(p_\alpha \sin\theta)} = r_{o p_\alpha}
\tag{22a}
$$

Hence,

$$
r_{(o)(p_\alpha \cos\theta)} = r_{(o)(p_\alpha \sin\theta)} = r_{o p_\alpha}
\tag{23a}
$$


This result indicates that the variance of a principal factor is 
homogeneous with respect to its correlation with a given external 
criterion, so that in this sense, at least, factor scores may be 
regarded as ‘pure’ scores. 

Once the validity coefficient of a factor has been computed, the 
contribution of the portion of the variance of each initial test 
variable that is accounted for by each factor can readily be 
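ascertained. For our illustrative example, the validity coefficients
of variables $x$ and $y$ may be written as:

$$
\begin{aligned}
r_{ox} &= \frac{\cos\theta\,\sigma_{p_\alpha} r_{o p_\alpha} - \sin\theta\,\sigma_{p_\beta} r_{o p_\beta}}{\sigma_x} \\
r_{oy} &= \frac{\sin\theta\,\sigma_{p_\alpha} r_{o p_\alpha} + \cos\theta\,\sigma_{p_\beta} r_{o p_\beta}}{\sigma_y}
\end{aligned}
\tag{24a}
$$

Equations (24a) may be expressed as:

$$
\begin{aligned}
r_{ox} &= \frac{\cos\theta\,\sigma_{p_\alpha} r_{o p_\alpha}}{\sigma_x} - \frac{\sin\theta\,\sigma_{p_\beta} r_{o p_\beta}}{\sigma_x} \\
r_{oy} &= \frac{\sin\theta\,\sigma_{p_\alpha} r_{o p_\alpha}}{\sigma_y} + \frac{\cos\theta\,\sigma_{p_\beta} r_{o p_\beta}}{\sigma_y}
\end{aligned}
\tag{25a}
$$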
Each term on the right-hand side of equations (25a) represents 
the contribution of one original test variable to the factor validity 
coefficient. 

Equations analogous to (25a) may be derived for use with 
factor loadings that take the form of correlation coefficients 
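rather than direction cosines. If in equations (25a) we substitute
for $\cos\theta$ and $\sin\theta$ values derived from identities (12a), (13a),
(14a), and (15a), we obtain:

$$
\begin{aligned}
r_{ox} &= \frac{\sigma_x r_{x p_\alpha}\sigma_{p_\alpha} r_{o p_\alpha}}{\sigma_{p_\alpha}\sigma_x}
 + \frac{\sigma_x r_{x p_\beta}\sigma_{p_\beta} r_{o p_\beta}}{\sigma_{p_\beta}\sigma_x} \\
r_{oy} &= \frac{\sigma_y r_{y p_\alpha}\sigma_{p_\alpha} r_{o p_\alpha}}{\sigma_{p_\alpha}\sigma_y}
 + \frac{\sigma_y r_{y p_\beta}\sigma_{p_\beta} r_{o p_\beta}}{\sigma_{p_\beta}\sigma_y}
\end{aligned}
\tag{26a}
$$

These expressions immediately simplify to:

$$
\begin{aligned}
r_{ox} &= r_{x p_\alpha} r_{o p_\alpha} + r_{x p_\beta} r_{o p_\beta} \\
r_{oy} &= r_{y p_\alpha} r_{o p_\alpha} + r_{y p_\beta} r_{o p_\beta}
\end{aligned}
\tag{27a}
$$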


Equations (27a) would be most convenient for use with the results 
of a centroid analysis, whereas equations (25a) would be most 
convenient for use with the results of a principal-axis analysis. 
It should be noted, however, that unless all of the nonchance 
variance of the original test variables is accounted for by the 
factors, equations analogous to (24a) are not correct. If non- 
chance variance remains unaccounted for by the factors, equa- 
tions analogous to (25a) and (27a) may yield underestimates of 
the obtained validity coefficients of the original variables. A 
comparison of the computed and obtained validity coefficients 
can always be made to detect any such discrepancies. 
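
The comparison of computed and obtained validity coefficients suggested above can be sketched with equations (27a). The example again treats test x as the 'criterion', an assumption made only so that every input comes from Table 2a: the factor validities are then the first row of that table, the computed validity of x should be unity, and that of y should equal the test intercorrelation of .72. The function name `validity_from_factor_loadings` is hypothetical.

```python
def validity_from_factor_loadings(loadings, factor_validities):
    """Validity of a test rebuilt from its factor loadings, after equations (27a).

    loadings: correlations of the test with each factor.
    factor_validities: correlations of each factor with the criterion.
    Valid only when the factors account for all nonchance test variance.
    """
    return sum(l * v for l, v in zip(loadings, factor_validities))


if __name__ == "__main__":
    # With x as criterion, factor validities are r(x, p_alpha) and r(x, p_beta).
    factor_validities = [0.979, -0.204]
    loadings = {"x": [0.979, -0.204], "y": [0.847, 0.532]}
    obtained = {"x": 1.00, "y": 0.72}

    for test, row in loadings.items():
        computed = validity_from_factor_loadings(row, factor_validities)
        print(f"{test}: computed = {computed:.3f}, obtained = {obtained[test]:.2f}, "
              f"discrepancy = {computed - obtained[test]:+.3f}")
```

Here the two factors exhaust the variance of both tests, so the discrepancies are only rounding error; a sizable shortfall would signal variance left unaccounted for by the factors.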

One more use of the direction cosines obtained from a principal- 
axis analysis should be mentioned. They make possible, at the 
expense of some labor, the estimation of the reliability coef- 
ficient for scores in each factor, provided that the reliability 
coefficients of the original test variables have been obtained. 
The method for accomplishing this has been published by the 
writer and will not be repeated here.¹⁰ It is worth mentioning,
however, that the computation of reliability coefficients for factor 
scores is not confined to factors obtained by principal-axis 
methods. If regression coefficients for estimating factor scores 
obtained by the centroid method have been calculated, these 
coefficients may be used to obtain validity coefficients for the 
estimated factor scores from equation (17a) as well as reliability 
coefficients of the estimated factor scores. Whether unit vari- 
ances or communalities were employed in the diagonal cells of 
the matrix is immaterial. 





10 F. B. Davis, “The Reliability of Component Scores,” Psychometrika, X,
No. 1 (March 1945), pp. 57-60. 











VOCABULARY ABILITY IN LATER MATURITY¹


CHARLOTTE FOX 


Northwestern University 


INTRODUCTION 


The importance of determining accurately the vocabulary 
level of older persons becomes increasingly apparent when we 
consider that tests for the measurement of mental deterioration 
employ vocabulary level as an indication of maximal intellec- 
tual functioning. Previous studies?*’ fail to agree, however, as 
to the stability of vocabulary ability in the over-sixty age range. 

The purposes of the present study are: (1) to compare the size
of the basic and estimated total vocabularies of a group of indi- 
viduals between the ages of seventy and seventy-nine with that 
of an intellectually comparable group of persons between the 
ages of forty and forty-nine, (2) to compare the relative diffi- 
culty among the older group of demonstrating word knowledge 
by definition and recognition criteria, (3) to investigate the feasi- 
bility of using group testing techniques with persons in the later 
maturity age range, and (4) to compare the quality of the defini- 
tions given by subjects in the two age groups. 

In previous studies, Sorenson’ and Christian and Paterson? 
reported no loss in vocabulary at the upper age levels. Shakow 
and Goldman,® on the other hand, report slow decline of ability 
beyond the age of sixty. They administered the 1916 Binet list 
of fifty words from a vest pocket dictionary to subjects between 
the ages of eighteen and ninety, using a representative sample of 
occupational levels. Educational level was used as a means of 
equating the age groups for intellectual level, and successively 
older decades had a decreasing mean number of years of schooling,
in accordance with the trend of educational attainment in
the United States during the years under consideration. The 
authors assumed that keeping the same mean number of years 
of schooling throughout the age range would result in a gradually 
increasing intellectual level in the older ages. 





1 This study was carried out under the direction of Dr. Robert H. Seashore,
whose assistance is gratefully acknowledged. The author’s present address
is: Unit on Gerontology, U.S. Public Health Service, Baltimore City
Hospitals.


PROCEDURE 


A. Subjects.—Several criteria were set up in the selection of 
subjects. Individuals used in this study were white men and 
women with no speech defects or incapacitating loss of vision or 
hearing, whose native language was English. For the purposes 
of the present study, the period of later maturity was defined 
as beyond sixty years of age. The age range used here for the 
experimental group was the decade 70-79, since any vocabulary 
loss which may be taking place should be present in clear-cut 
amount by that age. The range from forty to forty-nine was 
chosen for the control group because it is late enough in life to 
allow for most or all of the adult increase in word knowledge 
such as is reported by O’Connor,' and yet is too early to overlap 
with the period of suspected downward trend. 

Subjects were selected so that their mean educational level 
would approximate as closely as possible that of the population 
of that age decade, as reported by Shakow and Goldman.* In 
computing educational level, this study follows earlier investi- 
gators and credits either completion of or some attendance at a 
grade in determining the number of years of schooling. The 
mean education of our experimental group is 7.7 years, as com- 
pared with 7.1 for Shakow and Goldman’s sample of the seventy 
to seventy-nine year age group; the mean for the control sample 
is 10.7, as compared with Shakow and Goldman’s mean of 9.3. 
Shakow and Goldman considered their work as having been done 
in 1930. The education means quoted here are for their age 
groups of sixty to sixty-nine and thirty to thirty-nine, respec- 
tively, which are ten years older during the decade 1940-49. 

Subjects were chosen from several communities in an attempt 
to achieve a representative socio-economic sampling. About 
half of the older subjects were individuals who lived by them- 
selves or with their families, and the other half were chosen from 
three old peoples’ homes, including a county home. Subjects 
were rated on the Barr scale of occupational intelligence.}° 
Since the majority of the control subjects were housewives, they 
were rated on the basis of previous occupation in all cases in 
which such information was available. The subjects in the 
older group, all of whom are believed to be retired from gainful 
employment, were rated on the basis of their occupations during
the greater portion of their lives. Twenty-three subjects were
rated in each group, the other seven in each case being housewives. 

The characteristics of the two groups are summarized in 
Table I. 


TABLE I.—CHARACTERISTICS OF THE EXPERIMENTAL AND
CONTROL GROUPS

                                      Exp.             Con.
                                   Mean | SD        Mean | SD
Age                                74.5 | 2.5       44.1 | 3.1
Education (this sample)             7.7 | 3.31      10.7 | 2.7
Education (Shakow and Goldman)      7.1 | 2.5        9.3 | 3.8
Barr rating                        10.08            10.34
Sex: Men                           13                8
     Women                         17               22
Total N                            30               30








B. Tests Used.—The word lists used in this study were 
drawn from the English Recognition Vocabulary Test of Sea- 
shore and Eckerson.* A word, for the purposes of that test, 
was defined as an item in the Funk and Wagnalls’ New Standard 
Dictionary of the English Language, two volume edition of 1937.* 
A basic word, the type under consideration in this study, is 
defined as a word in heavier type and next to the margin. 

Three methods of testing were employed in this experiment:
group recognition (multiple choice), individually administered
recognition, and definition. The lists were
chosen from the one hundred seventy-three word Seashore- 
Eckerson test to give three comparable fifty-word lists of graded 
difficulty. The corrected split-half reliability of the one hundred 
seventy-three word test was found by its authors to be .83, so 
that the three lists employed in this study should be sufficiently 
comparable for use with group data. 

C. Administration.—The group test consisted of mimeographed
lists so arranged as to be easily read and comprehended.
It was administered to groups of from two to thirty-one persons,
with the exception of two tests in the control group which were
given individually. Some orientation was usually given as to 
the purpose of the testing, and the blanks calling for name, edu- 
cation, occupation, age, and English language background were 
explained. Subjects were encouraged to do their best, and were 
instructed to guess when they did not know a word. The exam- 
iner repeated the instructions to any individual who did not under- 
stand what to do, and pronounced any words when requested to 
do so. The subjects were allowed as much time as they needed, 
which averaged about twenty-five minutes for the group test. 

Individual tests were administered at the homes or places of 
employment of the subjects at times convenient to them. The 
words for the individual tests were lettered on five-by-eight-inch 
cards. On the definition list, the experimenter held up the card
with the word, pronounced the word, and asked the subject to 
define it. The test was continued until five consecutive words 
were missed. The individual recognition list was given at the 
same session, and the entire fifty-item list was administered. 
The examiner held up the card and read the stimulus word and 
the four choices. The average time for the two individual tests 
was about forty-five minutes. 

The two recognition lists were scored with a correction for 
guessing, as described by Seashore and Eckerson;® namely, number
of test items − (1.33 × errors). For the three cases in which
this resulted in a score of less than zero, the scoring used was:
number of items attempted − (1.33 × errors).
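
The two scoring rules quoted above can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, and since the text does not say exactly how errors were counted, the error count is simply taken as an input.

```python
PENALTY = 1.33  # weight applied to each error, as quoted in the text


def score_from_total(n_items, errors):
    """Number of test items minus 1.33 times the number of errors."""
    return n_items - PENALTY * errors


def score_from_attempted(n_attempted, errors):
    """Rule quoted for the cases in which the first score fell below zero."""
    return n_attempted - PENALTY * errors


# Hypothetical subject: fifty-item list, twelve errors.
example_score = score_from_total(50, 12)

# With four choices per item, 1.33 approximates 4/3, so the first rule is
# close to "rights minus one-third of the wrongs" when every item is answered.
rights_minus_third = (50 - 12) - 12 / 3

print(round(example_score, 2), rights_minus_third)
```

The small gap between the two printed values arises only because 1.33 is a rounded form of 4/3.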


RESULTS 


The mean scores and standard deviations for the three word 
lists—group recognition, individual recognition, and definition— 
are given in Table II. The differences in variability here are 
rather large, but were deemed not large enough to invalidate the 
statistical analyses employed. 

The technique of analysis of covariance was first applied to the 
data, in order to control the effect of the difference in education 
between the two groups beyond that which would equate them 
according to Shakow and Goldman’s* sampling. A correction of 
2.2 was subtracted from the years of education of each subject in 
the younger group, this number being the difference between 9.3 
and 7.1, the mean years of education of the samples used by
Shakow and Goldman for the age groups here employed. The
results of the t test comparisons, which indicate that there are no
significant differences between the two age groups on any of the 
three word lists, are presented in Table III, A. 
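
The education correction described above amounts to simple arithmetic, sketched here in Python. The names are hypothetical; the figures are those quoted in the text. Because the same constant is subtracted from every younger subject, applying it to the group mean gives the same result as averaging the corrected individual values.

```python
# Mean years of education in Shakow and Goldman's samples, as quoted in the text.
SG_MEAN_YOUNGER = 9.3
SG_MEAN_OLDER = 7.1
CORRECTION = round(SG_MEAN_YOUNGER - SG_MEAN_OLDER, 1)  # 2.2 years


def corrected_education(years, is_younger_group):
    """Subtract the correction from younger-group values only."""
    return round(years - CORRECTION, 1) if is_younger_group else years


# Sample means reported for the present study.
control_mean = corrected_education(10.7, True)
experimental_mean = corrected_education(7.7, False)

# Difference in education left for the covariance analysis to control.
remaining_gap = round(control_mean - experimental_mean, 1)
print(CORRECTION, control_mean, remaining_gap)
```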

In order to ascertain the significance of differences in perform- 
ance of each group on the three tests, an analysis of variance was 
performed for each group separately. In these analyses, the 
three sets of test scores for each group were treated as paired 
measures in order to take into account the correlation between 
test scores for individuals. A t test of each mean difference was
then carried out, applying the error estimate obtained from the 
analysis of variance in each case. The only comparisons which 
were not significant beyond the one-per-cent level are between 
group recognition and definition for the older subjects and 
between group and individually administered lists for the 
younger group. The t test data are shown in Table III, B.


TABLE II.—MEANS AND STANDARD DEVIATIONS FOR GROUP
RECOGNITION, INDIVIDUAL RECOGNITION AND DEFINITION
WORD LISTS

These analyses provide answers to the first three problems of
this investigation. Since no significant differences between the 
means of the two age groups were found for either the individually 
administered multiple choice word list (Individual Recognition) 
or the other word lists, we may assume that the average size of 
the basic and total vocabularies was essentially the same for the 
two age groups. 

The average basic vocabulary, as measured by the individual 
recognition list, was 52,900 words for the later maturity group 
and 55,100 words for the control group. The estimated total 
vocabularies for these two age groups were 128,000 and 144,000
words, respectively. A comparison may be made between the
total vocabularies as determined for these two groups of sub- 
jects and the total vocabulary of 155,736 words which was found 
by Seashore and Eckerson to be the average for college under- 
graduates, keeping in mind the differences in years of school 
attended. 


TABLE III.—COMPARISON OF AGE DIFFERENCES AND TEST
METHODS

A. Analysis of Covariance
t Test Comparisons: t = 1.974 at 5 Per Cent and 2.605 at 1 Per Cent.

Experimental Group:           Control Group:
GR-IR: 2.348 Sig. 5%          GR-IR: 0.684 Not Sig.
IR-D:  2.782 Sig. 1%          IR-D:  2.539 Sig. 5%
GR-D:  0.435 Not Sig.         GR-D:  1.855 Not Sig.

Experimental and Control:
GR: 1.602 Not Sig.
IR: 0.061 Not Sig.
D:  0.458 Not Sig.

B. Analysis of Variance
t Test Comparisons: t = 2.002 at 5 Per Cent and 2.663 at 1 Per Cent.

Experimental Group:           Control Group:
GR-IR: 3.041 Sig. 1%          GR-IR: 1.401 Not Sig.
IR-D:  3.604 Sig. 1%          IR-D:  6.505 Sig. 1%
GR-D:  0.563 Not Sig.         GR-D:  5.105 Sig. 1%
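
The significance labels in Part B of Table III can be checked with a short Python sketch that compares each t value with the two critical values given in the table. The function name and data layout are assumptions made for illustration; the numbers are those of the table.

```python
def significance(t, crit_5, crit_1):
    """Label a t value against the 5 and 1 per cent critical values."""
    t = abs(t)
    if t >= crit_1:
        return "Sig. 1%"
    if t >= crit_5:
        return "Sig. 5%"
    return "Not Sig."


CRIT_5, CRIT_1 = 2.002, 2.663  # Part B (analysis of variance)

part_b = {
    "Experimental": {"GR-IR": 3.041, "IR-D": 3.604, "GR-D": 0.563},
    "Control": {"GR-IR": 1.401, "IR-D": 6.505, "GR-D": 5.105},
}

labels = {
    group: {pair: significance(t, CRIT_5, CRIT_1) for pair, t in pairs.items()}
    for group, pairs in part_b.items()
}

for group, pairs in labels.items():
    for pair, label in pairs.items():
        print(group, pair, label)
```

Only the GR-D comparison for the experimental group and the GR-IR comparison for the control group fall short of the one-per-cent value.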


The second part of the problem was the comparison of the 
older group on definition and recognition criteria of word knowl- 
edge. A difference between the means of the individual recog- 
nition and definition tests was significant at the one-per-cent 
level of confidence for the later maturity group (Table III, B). 
There was also, however, a similar difference between the two 
tests for the control group, and, in addition, there was no sig- 
nificant difference between the two ages on either test. The con- 
clusion may be drawn, therefore, that people in general regardless 
of age (at least for the age range tested) find it more difficult to 
state the meaning of a term than to choose a synonym from 
among four alternatives.

The third object of the experiment was to investigate the
feasibility of using group testing techniques with older persons. 
There was a significant difference between the group and indi- 
vidual recognition scores for the later maturity group, with a 
correlation of .75 between these two lists. The group score for 
the older subjects, however, was not significantly lower than that 
for the younger subjects, so that the difference between the group 
and individual scores for the older subjects may be due to a chance 
factor. The conclusion may be drawn that the printed group 
test is adequate if a group survey of alert older subjects is desired, 
but that individual testing methods will give a more accurate 
estimate of maximal functioning, particularly with older subjects 
who have difficulty in attending to and following instructions. 
The fourth object of this study was to compare the quality of
the definitions given by the two age groups. This was done by 
scoring the words as plus, one-half, and minus credits, with the 
total definition score equal to the number of plus credits, plus 
one-half the number of half credits. Since no scoring norms were 
available, the definitions were compared with the dictionary
definition, the correct choice from the Seashore-Eckerson recog- 
nition list, and the total range of definitions given by the present 
subjects. 
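
The definition scoring rule just described can be written as a small Python function. This is a sketch only: the credit labels and the sample record are hypothetical, while the rule itself (plus credits plus one-half the half credits) is taken from the text.

```python
def definition_score(credits):
    """Total definition score: plus credits count 1, half credits 0.5, minus 0."""
    allowed = {"plus", "half", "minus"}
    unknown = set(credits) - allowed
    if unknown:
        raise ValueError(f"unrecognized credit labels: {sorted(unknown)}")
    return credits.count("plus") + 0.5 * credits.count("half")


# Hypothetical record for one subject on five words.
sample_credits = ["plus", "plus", "half", "minus", "half"]
sample_score = definition_score(sample_credits)
print(sample_score)
```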

Since there was no significant difference between the two age 
groups on the total definition scores, the relative quality of the 
responses for the age groups was investigated by comparing the 
mean number of half-credits for the two age groups, in this case 
7.47 for the older and 6.20 for the younger group. This was 
accomplished through an analysis of covariance, with the finding 
that the resulting F was not significant at the five-per-cent level 
of confidence. In other words, the quality of the definitions was 
not seen to decline with increasing age of the subjects. 


DISCUSSION 


Intelligence, education, and socio-economic status form a 
group of interrelated variables which influence the size of the 
vocabulary which an individual acquires. Intelligence is probably
the one factor most closely related to vocabulary size; vocabulary
has been found to correlate more highly with total mental age
on the Stanford-Binet than does any other single Binet subtest,
with an r of .81 for an adult group reported by Weisenberg, Roe
and McBride” and correlations for single age groups of from .65
to .91 and averaging .81 reported by Terman and Merrill.'! 
Wechsler" reports a correlation of .85 between the vocabulary 
subtest and the total Wechsler-Bellevue scale. 

Since it would have been impracticable to try to give intelli- 
gence tests to the subjects in this study, educational level was 
chosen as the best alternative. Lorge‘ reports correlations of 
from .44 to .66 between last grade of school completed and 
scores on the Otis. Wechsler! states that for studies previous to 
his, the correlation between number of years of attendance at 
school and intelligence test scores ranges in most cases from .60 
to .80; for the adult population used by Wechsler, the corre- 
lation is .64, or .53 when mental defectives are omitted. While 
these correlations are of only moderate degree, years of education 
seems nevertheless to be the most accurate indirect means avail- 
able at present for estimating intellectual level. 

Shakow and Goldman? report a correlation of .64 between 
education and vocabulary, but they believe that the relationship 
is due almost entirely to the indirect effect of mental level, with 
persons of higher mental level tending to have more years of 
schooling. The correlations between education and vocabulary 
score found in this study were .48 for the later maturity group 
and .68 for the control group. 

The results of this study tend to disagree with the findings of 
Shakow and Goldman,® who reported a significant decrease of 
measured vocabulary ability in the years past sixty. The 
present results also seem to require a modification of their 
hypothesis that the absence of a loss in the work of previous 
investigators”? was due to the method of equating the educational 
level which resulted in a higher intellectual level in the older 
group. 

A possible explanation of the difference in results between this
work and that of Shakow and Goldman’ may lie in the relative 
total numbers of words in the dictionaries from which the test 
samples were chosen, our larger dictionary sample giving a fuller
opportunity to show a knowledge of any word. Seashore and 
Eckerson® point out that in previous estimates of total size of 
English non-technical vocabulary, the size of the estimated 
vocabulary is roughly proportional to the size of the dictionary 
from which the sampling was taken, and that only the largest
dictionaries include all the words which a person might know.
None of the previous investigators had used a dictionary as large 
as the 450,000 word Funk and Wagnalls Unabridged edition on 
which the Seashore-Eckerson test is based. Additional sampling 
variables not considered in either study may also be a factor. 

The results of this experiment tend to corroborate previous 
work which led to the rationale for the use of vocabulary as a 
stable factor against which to measure the deterioration of 
mental processes. It has been suggested by Ackelsberg! on the 
basis of her work with senile dementia cases, however, that vocab- 
ulary declines with progressive degree of mental deterioration in 
these patients and may be used in itself as a measure of deteriora- 
tion rather than as an ability which is constant and unchanging 
during the deteriorative process. Shakow, Goldman, and 
Dolkart’ also found a greater decline in senile psychotics than in 
normal old people. 

Yacorzynski!® suggests that individuals evidencing mental 
deterioration may decline with respect to the number and 
quality of the definitions which they are able to give for any 
particular word. The present study fails to demonstrate any 
decline in quality of definitions among a group of older persons 
who are assumed to be deteriorated only in the amount normal 
for their age. 

Both definition and recognition criteria of word knowledge 
were tested in this study in order to investigate the possibility 
that rigidity in older people might make it hard for them to earn 
good scores on multiple choice tests, where they are offered only 
one correct alternative even for words that have multiple mean- 
ings. Our results do not, however, bear this out. 

An attempt was made in this experiment to secure as represen- 
tative as possible a sampling of each age group population. The 
greatest error seems to be in socio-economic status as reflected in 
the Barr ratings of 10.08 and 10.34, as compared with the more 
representative mean rating of 8.3 in the study by Weisenberg, 
Roe, and McBride.'* The importance of socio-economic status 
in studies of vocabulary is pointed out by Thorndike and Gallup, '? 
who found successive increases in the word knowledge of persons 
of increasing family income levels. 

It should be noted here that the definitions in the present 
experiment were scored only in half-steps and by only one judge.
The experimenter is of the opinion, however, that an analysis of
the definitions into finer discriminations than half-steps would 
probably yield little overall change in results. 


SUMMARY 


1) Vocabulary lists were administered to thirty subjects 
between the ages of seventy and seventy-nine years and to thirty 
subjects between the ages of forty and forty-nine years who 
constituted a sample of the general population at those ages with 
regard to amount of formal education and socio-economic status. 

2) Subjects were given a printed multiple-choice word list in a 
group situation, and were later tested individually on an oral 
multiple-choice list and an oral definition list. 

3) No significant differences were found between the two age 
groups on any of the three tests. The definition task was 
significantly more difficult than the individual recognition task 
for both age groups. 

4) The results fail to indicate conclusively the feasibility of 
group testing with older people, but wide variability on the 
group test suggests that testing methods should be adapted to 
the specific group under consideration. 

5) A statistical analysis of the half-credit definition scores for 
the two age groups indicates that there is no significant difference 
in the quality of the definitions given by the older and younger 
subjects.


AN EXPERIMENT IN RATING
EDUCATIONAL PSYCHOLOGY TEXTBOOKS 


RAY H. SIMPSON 
Associate Professor of Educational Psychology 


University of Illinois 


How should textbooks be rated? Who should share in an 
evaluation of the probable worth of a textbook to certain groups 
of learners? How may rating of materials be used to increase 
an understanding of such materials and to improve the ability 
critically to select and use appropriate materials? With these 
questions in mind, the writer decided to experiment with two 
classes of eighty-one graduate students* in Advanced Educational 
Psychology. 

Some months before the classes were started seven of the best 
and most recently published books which the instructor thought 
would have value as reading material in educational psychology 
classes were selected. Five of the books selected are educational 
psychologies. One of the books selected, Psychology Applied to 
Life and Work, approaches “the problems of life and work as an
applied psychologist writing for the layman.” Another book,
Successful Teaching, while not strictly an educational psychology,
is an attempt to provide “the bridge between our psychological
knowledge and the practical teaching job.” 

On the first day of class, each student was asked to acquire at 
least one of the books on the following list: 


Woodruff. Psychology of Teaching. Longmans, 1946. 

Skinner. (Ed.) Educational Psychology (Revised). Prentice- 
Hall, 1946. 

Pressey and Robinson. Psychology and the New Education. 
Harper, 1944. 

Mursell. Successful Teaching. McGraw-Hill, 1946. 

Kingsley. The Nature and Conditions of Learning. Prentice- 
Hall, 1946. 

Hepner. Psychology Applied to Life and Work. Prentice- 
Hall, 1946. 

Gates et al. Educational Psychology. Macmillan, 1942. 





* Classes included 11 administrators, 16 elementary teachers, 48 high 
school teachers, four college teachers, and two prospective teachers.


BACKGROUND FOR TEXT RATINGS


As a background for a consideration of the rating of the texts, a 
brief picture of the purposes of the course from the instructor’s 
and learners’ standpoints will next be given. 

As will be seen by the brief outline that follows, the course was 
not a conventional learning of textbook information. Rather, 
an attempt was made to start with identification and selection of 
individual teacher and administrator problems and then see what 
ideas and principles from educational psychology and elsewhere 
could be used in the analysis of such problems and in getting 
possible solutions to them. Each student was given a mimeo- 
graphed sheet containing the points listed below under the 
heading: 

PURPOSES OF COURSE AS SEEN BY INSTRUCTOR 


[To Help Each of You Individually To Do More Effective and 
Satisfying Professional Work] 


1) To help you to identify present and prospective problems 
you face or are likely to face. 

2) To help you to select for study those problems upon which 
it is most desirable for you to work at this time. 

3) To help you to learn better how to attack systematically 
professional problems so that you will not only get tentative or 
possible solutions to some specific problems but will learn the 
process of intelligent problem-solving. 

4) Reaching the first three goals with you will involve guided 
practice in: 

More effective personal record keeping 
Self-evaluation, including self-diagnosis 

Getting and using appropriate resources 

Developing effective democratic inter-personal relations 
Developing more purposeful and economical reading 
skills needed for recognizing and solving professional 
problems. 

5) To help each of you objectify for yourself (1) your long-time 
professional goals, and (2) your short-time professional goals. 
With this is tied up the need to see both: (1) What kinds of steps 
we hope to take in our schools eventually and (2) Next steps 
we can and should take in our professional activities.

6) To help each of you to see applications of the teaching and
learning methods used here to later teaching and learning situ- 
ations in which you may participate. 


The Best Teacher Is the One Who Teaches the Learner How To 
Teach Himself. We Will Try To Teach You Auto-education. 





Each student was encouraged with guidance to consider pro- 
fessional problems which seemed of importance to him and 
attempt to determine through criteria set up with other students 
and the instructor upon which of these problems it would be 
desirable for him to work. Thus, the learner got some guided 
practice in one of the first steps in effective learning, that of 
intelligently deciding what to study. 

After each learner had selected one or more professional 
problems which seemed of significance to him and to the instruc- 
tor, work was started on getting possible solutions to these 
problems using conventional educational psychology material 
and other material which gave promise of giving help. In the 
latter category came reports from classroom or school situations 
in which principles of effective learning seemed to be illustrated. 

In an attempt to help the learner see practical applications of 
the principles of educational psychology, an annotated bibliog- 
raphy of over one hundred titles was given to each student. 
These references pictured actual teacher and administrative 
experimentation and experience in trying to make use of edu- 
cational psychology and its teachings. 

The instructor tried to teach principles of effective learning 
through practice as well as through precept. Individual differ- 
ences were not only recognized and talked about, but at least 
eighty-five per cent of all out-of-class work was actually individ- 
ualized. The class served as an example of how individual 
differences could be provided for to a significant degree in large 
groups. A serious attempt was made to practice what was 
preached. 


RATINGS OF TEXTBOOKS 
Each student was encouraged to use at least six of the seven 
texts in the following ways: (1) To get help in identifying possible 
problems upon which he should work, (2) To gather ideas for
possible solutions to the problems which seemed important to
him, and (3) To get an overview of the field of educational 
psychology so he would know the type of problems upon which 
it could give him help. Each text was to be rated primarily 
upon the extent to which the learner felt it helped him in points 
(1) and (2) above, and secondarily in point (3). 

To help the learner crystallize his reactions to the text, he was 
asked to write a short book review. At the end of the review he 
was requested to give a rating on a nine-point scale, with nine
indicating greatest value, one the least value, and other numbers 
of the scale intermediate values. Students were encouraged to 
hold ratings till several books had been examined so that a 
mental yardstick on which to rate the books could be built up. 
The results of these ratings are shown in Table 1. 


TABLE 1.—STUDENTS’ RATINGS OF TEXTBOOKS

                      Number of Students Giving Rating of:       Number
Average               (High)                         (Low)    of Students
Rating    Text          9    8    7    6    5    4    3    2    1    Rating
7.9       Pressey      18   42   13    3    2    1                     79
7.6       Mursell      29   13    9    4    4    3    2    1           65
7.5       Skinner      15   27   17    8    3    2    1                73
7.0       Gates        12   20   22    9    6    6    1                76
6.8       Woodruff     14   18   14   16   10    3    4    1           80
6.74      Kingsley      6   15   24   15    8    5                     73
6.69      Hepner       15    5    6    6    6    2    1    3    1      45


































On the last day of the course, each student was asked in an
anonymous questionnaire to give his opinion on the three ques-
tions which are indicated below with the results in each case
after the question:

1) Which text that you have used* would you prefer to have
in your professional library? Second choice?

                                                       Combination of 1st and
                                                       2nd choices (1st weighted
First Choice               Second Choice               2, 2nd weighted 1)
Mursell               28   Pressey and Robinson   21   Mursell                67
Pressey and Robinson  17   Skinner et al          12   Pressey and Robinson   55
Skinner et al         16   Mursell                11   Skinner                44
Hepner                11   Hepner                 11   Hepner                 33
Woodruff               4   Gates et al             9   Woodruff               15
Gates et al            3   Woodruff                7   Gates                  15
Kingsley               1   Kingsley                7   Kingsley                9


2) If made available to them, which text that you used* do
you think would probably have the most effect in improving the
teaching of colleagues with whom you have worked? Second
choice?

                                                       Combination of 1st and
                                                       2nd choices (1st weighted
First Choice               Second Choice               2, 2nd weighted 1)
Mursell               39   Pressey and Robinson   21   Mursell                85
Pressey and Robinson  14   Woodruff               13   Pressey and Robinson   49
Woodruff               8   Skinner et al           9   Woodruff               29
Skinner                8   Kingsley                8   Skinner                25
Gates                  5   Mursell                 7   Gates                  16
Kingsley               3   Gates et al             6   Kingsley               14
Hepner                 0   Hepner                  5   Hepner                  5


3) Which text of those you used* do you think would probably
be best for undergraduates without teaching experience? Second
choice?

                                                       Combination of 1st and
                                                       2nd choices (1st weighted
First Choice               Second Choice               2, 2nd weighted 1)
Pressey and Robinson  27   Gates                  22   Pressey and Robinson   73
Gates                 22   Pressey and Robinson   19   Gates                  66
Mursell               13   Skinner                13   Mursell                36
Skinner               10   Mursell                10   Skinner                33
Woodruff               8   Woodruff                9   Woodruff               25
Hepner                 3   Kingsley                4   Kingsley               10
Kingsley               3   Hepner                  2   Hepner                  8


* It should be recalled that eighty used Woodruff, seventy-nine used
Pressey, seventy-six used Gates, seventy-three used Skinner, seventy-three
used Kingsley, sixty-five used Mursell and forty-five used Hepner. If all
had used Hepner, his status would, no doubt, have been raised significantly,
Mursell’s raised somewhat, and Kingsley’s and Skinner’s raised slightly.


It should be pointed out that each of the books rated was con- 
sidered quite worth while by both the students and the instructor. 
It is possible that under a different learning situation the ratings 
might have been significantly different. 
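
The combined column in each of the three tables above is a weighted tally: every first choice counts two points and every second choice one point. A minimal sketch in Python follows, using the counts reported for question 1. The function and variable names are illustrative assumptions and are not part of the original study.

```python
# Weighted tally of first and second choices, as described in the table
# headings ("1st weighted 2, 2nd weighted 1"). Counts are those reported
# for question 1; all names here are illustrative.
FIRST_WEIGHT = 2
SECOND_WEIGHT = 1

first_choice = {
    "Mursell": 28, "Pressey and Robinson": 17, "Skinner": 16,
    "Hepner": 11, "Woodruff": 4, "Gates": 3, "Kingsley": 1,
}
second_choice = {
    "Pressey and Robinson": 21, "Skinner": 12, "Mursell": 11,
    "Hepner": 11, "Gates": 9, "Woodruff": 7, "Kingsley": 7,
}


def combined_scores(first, second):
    """Return {text: 2 * first-choice votes + 1 * second-choice votes}."""
    texts = list(first) + [t for t in second if t not in first]
    return {
        t: FIRST_WEIGHT * first.get(t, 0) + SECOND_WEIGHT * second.get(t, 0)
        for t in texts
    }


scores = combined_scores(first_choice, second_choice)
# Highest combined score first; ties keep their first-choice order.
ranking = sorted(scores.items(), key=lambda item: -item[1])

for text, score in ranking:
    print(f"{text:<22}{score:>3}")
```

Run on the question 1 counts, the tally reproduces the combined column printed in the table; the same function applies unchanged to questions 2 and 3.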
DIAGNOSTIC TESTING AND
REMEDIAL TEACHING FOR COMMON ERRORS 
IN MECHANICS OF ENGLISH 
MADE BY COLLEGE FRESHMEN 


MARY McGANN 


Worcester, Mass.


This article is a summary of a detailed study¹ of common
errors in mechanics of English made by a group of college freshmen.
In addition to locating the most frequent errors, there was
an attempt to test the relative efficiency of the individual and
the group method of remedial instruction.

The Coöperative English Test A, Form Q, Mechanics of
Expression, was administered to a class of ninety-seven freshmen,
sixty-two boys and thirty-five girls, at Clark University, Worcester,
Mass. All items where errors occurred were tabulated
and ranked, so that thirty items where errors occurred most 
frequently could be analyzed to form the basis of remedial 
instruction. 

Ten of the fifteen leading errors are presented in rank order in
Table I.


TABLE I

Errors in Grammar

Per cent  Test
Failed    Item

   79     39)  It was one of those editorials that (do, does) more
               to arouse prejudice than to enlighten.

   47     32)  The opponents of the bill did not believe that
               either the safety (or, nor) the neutrality of the
               country were threatened.

   47     56)  The experience and education of the candidate
               (were, was) carefully outlined in the letters of
               application.

   44     60)  The (principle, principal) reason for using this
               process is its economy.





¹McGann, M. E. Diagnostic Testing and Remedial Teaching Applied to
the Errors of College Freshmen in English. Worcester, Mass.: Clark University,
M.A. Thesis, 1943.
TABLE I (Continued)

Per cent  Test
Failed    Item

   42     29)  They frankly expressed their approval of the senator
               (who, whom) they told us had refused to
               accept the compromise.

   40     59)  The merchandise which the natives are most
               eager to buy (are, is) trinkets, tools, and cotton
               cloth.

   37     52)  The application of the principles discovered during
               those years (have, has) been of great value.

   36     40)  When he had drawn his chair up to the table, she
               (sat, set) hot food before him.

   34     38)  The change had little (affect, effect) on the efficiency
               of the machine.

   33     43)  He agreed that, in (principal, principle), honesty
               is the best policy.


Errors in punctuation-capitalization

Per cent  Test
Failed    Item

   46     18)  . . . he was one of the generals put in command
               of the forces in the west.

   44     35)  . . . he replied in Topeka Kansas, . . . but my
               father . . . was soon transferred to Philadelphia

   40     43)  . . . womens dresses . . .

   33     11)  I was born he replied in . . .

   31     50)  . . . with uncle Tom . . .

   29     24)  As the evening light fades the trees and shrubs
               blur together only the white flowers bordering the
               path remain distinct.

   27     41)  . . . in the hot crowded room . . .

   25     49)  John looked up and said, there is no . . .

   24     44)  . . . an occasional rasping cough . . .

   24     10)  I was born he replied in . . .


In a similar manner all other items where errors occurred were 
listed in rank order. Results of this diagnosis revealed that: 

1) Most errors in grammar were due to lack of knowledge of
the various forms that a subject of a sentence may take, resulting
in the wrong choice of verb, or errors of agreement of subject with
predicate.

2) Most errors in punctuation involved the use of quotation
marks in direct discourse, apostrophe in plural possession, and
semicolon used in place of a conjunction.

3) Most errors in capitalization involved words referring to a
section of the country, family relationships followed by the proper
name, and adjectives derived from proper nouns.

To test the relative efficiency of the individual and group 
methods of teaching, three equivalent groups of students were 
arranged. The three groups were made comparable for experi- 
mental purposes by equating them on the basis of medians and 
averages on an English test and on a test for scholastic aptitude. 
One section was instructed by the individual conference method, 
one was taught by a group method, and the third was held as a 
control. 

Remedial instruction was given by the author to both experimental
groups. To make the instruction identical, mimeographed
drill lessons, based on samples of the thirty leading
errors in the two subtests (grammar and punctuation-capitalization),
were prepared so that every student had copies for both
study and drill. The individual group had two fifty-minute lessons
and the group-method group had four twenty-five-minute
lessons, making a total of one hundred minutes of instruction per
student in each group.

The first step in teaching was to return the scored test papers 
where errors were marked in red pencil, identifying specific errors 
for individual pupils. Next, each error was related to the drill 
sentences, involving the same kind of construction, on the mimeo- 
graphed drill sheets. The teacher, together with the class, dis- 
cussed the error items according to rank, and explained the 
correct usage by means of logic as far as possible. This explana- 
tion was supplemented by illustrations and explanations on the 
blackboard. All blackboard work was previously planned to 
include identical illustrations for both groups. As each item was 
explained, attention was directed to the mimeographed drill per- 
taining to it. Oral drill was followed by written drill on these 
sentences, and since every student had his own copy, he rapidly 
underscored, capitalized, or punctuated, as was required. Written
drill was followed by self-correction under guidance of the
teacher, who supplied the correct form orally. The last lesson was
a review. At the end of the remedial lessons every student was
again permitted to see his original test paper and check on his 
errors. The same procedure was followed in individual and in 
group instruction. At the conclusion a retest was given with a 
comparable form of the initial test, Form S. The results were 
tabulated with special attention to gains. 

The actual number of points gained or the differences between 
the raw scores on the initial test and the retest formed the basis 
of comparisons between the experimental groups. The results 
for the combined subtests—grammar and punctuation-capitaliza- 
tion—showed that the individual group gained the most, the 
mean gain being 19.79 points; the section instructed by the group 
method had a mean gain of 18.11 points; the control group gained 
the least, 16.41 points. 

The mean differences of the instructed groups over the control 
were small, and not statistically reliable. The individual group 
had the highest mean difference over the control; namely, 3.38 
points, and the section taught by the group method had 1.70 
points for the mean difference over the control. 
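
The differences over the control quoted above follow directly from the three reported mean gains. A minimal sketch of that subtraction in Python is given here; the names are illustrative assumptions, and the sketch does not attempt the reliability test, for which the article reports no variances.

```python
# Mean gains (combined subtests) as reported in the text.
mean_gain = {"individual": 19.79, "group": 18.11, "control": 16.41}


def difference_over_control(gains, control_key="control"):
    """Subtract the control group's mean gain from each instructed group's."""
    control = gains[control_key]
    return {
        name: round(gain - control, 2)
        for name, gain in gains.items()
        if name != control_key
    }


diffs = difference_over_control(mean_gain)
for name, diff in diffs.items():
    print(f"{name} method: {diff:.2f} points over the control")
```

The two values printed agree with the 3.38 and 1.70 points given in the text.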

In the separate subtests it was revealed that in grammar the 
individual group gained the most, 5.29 points mean gain; the 
section taught by the group method had a mean gain of 3.74 
points, and the control group gained 4.45 points. In punctua- 
tion-capitalization the section taught by the group method 
gained the most, the mean gain being 14.84 points; the individual 
group gained 14.54 points. Thus, results in grammar showed 
that the individually instructed group gained more than the sec- 
tion instructed as a group, but results in punctuation-capitaliza- 
tion showed that the group method of instruction was slightly 
superior. 

A tabulation of errors revealed a total of 1191 made in the 
grammar subtest, and 736 in the punctuation-capitalization sub- 
test. When these errors were separated according to sex groups, 
it was revealed that each boy, on the average, made 24.2 errors 
for the combination subtests, and each girl made only 12.1 
errors, or half as many errors as the boys made on the total 
test. In grammar the boys averaged 15.2 errors, and the girls, 
7.1; in punctuation-capitalization the boys averaged 9.0 errors, 
and the girls, 5.0. Such results indicate that mechanics of English
should be given more careful attention by those who teach
boys.
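
The per-student averages can be checked against the reported error totals, using the class composition given earlier (sixty-two boys and thirty-five girls). The sketch below, in Python, is only an arithmetic consistency check under the assumption that every student took both subtests; the names and the rounding tolerance are illustrative and not part of the study.

```python
# Class composition and error figures as reported in the article.
N_BOYS, N_GIRLS = 62, 35

reported_total = {"grammar": 1191, "punctuation-capitalization": 736}
mean_errors = {
    "grammar": {"boys": 15.2, "girls": 7.1},
    "punctuation-capitalization": {"boys": 9.0, "girls": 5.0},
}

# Each average is printed to one decimal place, so the total implied by the
# averages may differ from the true total by up to 0.05 errors per student.
tolerance = 0.05 * (N_BOYS + N_GIRLS)


def implied_total(means):
    """Total errors implied by the boys' and girls' per-student averages."""
    return N_BOYS * means["boys"] + N_GIRLS * means["girls"]


gaps = {}
for subtest, means in mean_errors.items():
    gaps[subtest] = abs(implied_total(means) - reported_total[subtest])
    status = "consistent" if gaps[subtest] <= tolerance else "inconsistent"
    print(f"{subtest}: implied {implied_total(means):.1f}, "
          f"reported {reported_total[subtest]} ({status})")
```

Both subtests fall within the rounding tolerance, so the averages and totals in the paragraph above agree with one another.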


SUMMARY AND CONCLUSION 


A study of the most common errors in mechanics of English made
by college freshmen on the Coöperative Test in English, Form Q,
was undertaken to test the relative efficiency of the group and the
individual methods of remedial instruction. Lessons were based
on thirty leading errors in agreement between subject and predicate,
quotation marks, apostrophe for plural possession, semicolon
in place of a conjunction, and capitalization of words referring to
a section of a country (not direction), expressions involving the
use of a proper name, and adjectives derived from a proper noun.
Equivalent lessons were taught to equated groups so that there
was a total of one hundred minutes of remedial instruction before
the retest was made. The gains were then tabulated with the
following results:

1) On both subtests, taken as a whole, the individual group 
had the highest mean improvement over the control, 3.38 points. 
The section taught by the group method had a mean gain of 1.70 
points over the control. 

2) When results on the two separate subtests were considered,
the individual group gained slightly more in grammar, while the
section taught by the group method gained more in punctuation-capitalization.
These differences, however, were small and not
statistically reliable.

This finding that the two methods are about equal in efficiency, 
under the conditions of the study, is of practical interest. 














BOOK REVIEW 


R. S. WOODWORTH AND D. G. MARQUIS. Psychology. Fifth
Edition. New York: Henry Holt and Co., 1947, pp. 677. 


This edition represents some improvement in a book that has 
already established an enviable record in previous editions. 
Perhaps the most marked feature of the revision is its reorgani- 
zation, although some space is given to new material. The 
addition of new topics like projective tests of personality and 
frustration and aggression, a greater emphasis upon reinforcement
and extinction, and the reintroduction of a chapter on
mental development reflect the general development of the field 
of psychology in recent years. In fact, a student could gain a 
fair idea of the development of modern psychology by examin- 
ing the various editions from 1921 to 1947. Chapters on will, 
instinct, and association had disappeared by the advent of the 
third edition (1934). Through the successive editions increasing
emphasis has been placed upon individual differences: intelligence,
personality, mental development, etc.; the treatment of
learning and memory has reflected the steady development of 
these areas, while sensation and the nervous system have changed 
least. The book has assumed a bit more of an applied character 
through the various editions. 

This edition remains essentially the same as the third and 
fourth editions. There have been no dramatic changes. The 
aim, point of view, and doubtless the conception of the function 
of the introductory course in psychology have not changed 
appreciably. The authors “have introduced new material when
feasible, eliminated some old material that the beginning student
will not miss, simplified many passages with the avoidance of
superfluous technical terms and synonyms, and clarified the
organization both of separate chapters and of the book as a
whole.” This is an adequate statement of the case for the fifth
edition. Quite a few halftones have been added. Generally
they are poorly done and do not add to the attractiveness of the
volume.
                                                  J. B. STROUD


State University of Iowa 